Everything, altogether, all at once: Addressing data challenges when measuring speech intelligibility through entropy scores

Authors
Affiliations

University of Antwerp

University of Antwerp

University of Antwerp

Published

April 4, 2024

Keywords

Bayesian analysis, speech intelligibility, bounded outcomes, clustering, measurement error, outliers, heteroscedasticity, generalized linear latent and mixed models, robust regression models.

1 Aim

The purpose of this walk-through is to improve the transparency and replicability of the analysis for the study Everything, altogether, all at once: Addressing data challenges when measuring speech intelligibility through entropy scores (in press). This digital document contains all the code and materials used in the study. Furthermore, the walk-through follows the When-to-Worry-and-How-to-Avoid-the-Misuse-of-Bayesian-Statistics checklist (WAMBS checklist) developed by Depaoli & van de Schoot (2017). The checklist outlines ten crucial points that need careful scrutiny when employing Bayesian inference procedures.

WAMBS checklist

Questionnaire outlining the ten crucial points that need careful scrutiny when employing Bayesian inference procedures, with the ultimate goal of enhancing the transparency and replicability of Bayesian analysis (Depaoli & van de Schoot, 2017).

Code
# load packages
libraries = c('stringr','RColorBrewer','dplyr','knitr',
              'rethinking','rstan','StanHeaders','runjags')
sapply(libraries, require, character.only=TRUE)
Code
# load functions
main_dir = '/home/josema/Desktop/1. Work/1 research/PhD Antwerp/#thesis/paper1'
source( file.path( main_dir, 'walkthrough/code', 'user-defined-functions.R') )

2 Organization

In this walk-through, Section 3 introduces various background topics that are relevant to the present study. These topics enable readers to progress smoothly through this research. Specifically, Section 3.1 provides a brief explanation of how Bayesian inference procedures work and their importance for this research. Section 3.2 explains the difference between two particular distributions, the normal and the beta-proportion distribution, and their role in modeling bounded data. Section 3.3 explains (generalized) linear mixed models, elaborating on their role in modeling (non)normal clustered and bounded data. Section 3.4 illustrates the concept of measurement error and the role of latent variables in overcoming the problems arising from it. Lastly, Section 3.5 explains the effects of distributional departures in the data on the parameter estimates, and their importance for this research.

The specific analyses for this study are elaborated from Section 4 onwards. In particular, Section 4 elaborates on the general context, gaps and main purpose of the study. Section 5 introduces the research questions that guide this study. Section 6 explores the data and its implications. Section 7 thoroughly develops the methods to analyze the data. Section 8 provides answers to the research questions. Section 9 discusses the findings, limitations and future research derived from this study. Lastly, Section 10 provides the concluding thoughts for the study.

The R packages utilized in the production of this document can be divided into three groups. First, the packages utilized to generate the walk-through: RColorBrewer (Neuwirth, 2022) and quarto (Allaire, Teague, Scheidegger, Xie, & Dervieux, 2022). Second, the packages used for handling the data: stringr (H. Wickham, 2022), dplyr (Hadley Wickham, François, Henry, Müller, & Vaughan, 2023), tidyverse (Hadley Wickham et al., 2019), and reshape2 (H. Wickham, 2007). Lastly, the packages used for the Bayesian estimation: coda (Plummer, Best, Cowles, & Vines, 2006), loo (A. Vehtari et al., 2023; A. Vehtari, Simpson, Gelman, Yao, & Gabry, 2021b), cmdstanr (Gabry & Češnovar, 2022), rstan (Stan Development Team, 2020), runjags (Denwood, 2016), and rethinking (McElreath, 2021).

3 Interludes

3.1 Bayesian inference

3.1.1 Theory

Bayesian inference is an approach to statistical modeling and inference that is primarily based on Bayes’ theorem. The procedure aims to derive appropriate inference statements about a set of parameters by revising and updating their occurrence probabilities in light of new evidence (Everitt & Skrondal, 2010). The procedure consists of defining the model assumptions in the form of a likelihood for the outcome and a set of prior distributions for the parameters of interest. Upon observing empirical data, these priors are updated to posterior distributions following Bayes’ rule (Jeffreys, 1998), from which the statistical inferences are derived 1. As an example, a simple linear regression model with a parameter \beta can be encoded under the Bayesian inference paradigm in the following form:

Bayesian inference

Approach to statistical modeling and inference, that aims to derive appropriate inference statements about one or a set of parameters by revising and updating their probabilities in light of new evidence (Everitt & Skrondal, 2010).

\begin{align*} P(\beta | Y, X ) &= \frac{ P( Y | \beta, X ) \cdot P( \beta ) }{ P( Y ) } \end{align*} \tag{1}

where P( Y| \beta, X ) defines the likelihood of the outcome, which represents the assumed probability distribution for the outcome Y, given the parameter \beta and covariate X. This is the distribution that describes the assumption about the underlying processes that give rise to the data (Everitt & Skrondal, 2010). P( \beta ) defines the prior distribution of the parameter \beta. A prior is a probability distribution summarizing the information about a parameter known or assumed before observing any empirical data (Everitt & Skrondal, 2010). Lastly, P( Y ) defines the probability distribution of the data, which represents the evidence of the observed empirical data. Consequently, the posterior distribution of the parameter P( \beta | Y, X ) describes the probability distribution of \beta after observing empirical data.

Likelihood

Probability distribution that describes the assumption about the underlying processes that give rise to the data (Everitt & Skrondal, 2010).

Prior distribution

Probability distribution summarizing the information about a parameter known or assumed before observing any empirical data (Everitt & Skrondal, 2010).

Posterior distribution

Probability distribution summarizing the information about a parameter after observing empirical data (Everitt & Skrondal, 2010).

Before implementing Bayesian inference procedures, two important concepts related to Equation 1 need to be understood. First, the evidence of the empirical data P(Y) serves as a normalizing constant. This means that the numerator in the equation is re-scaled by a constant obtained from calculating P(Y). Consequently, without loss of generality, the equation can be succinctly rewritten in the following form:

\begin{align*} P(\beta | Y, X ) &\propto P( Y | \beta, X ) \cdot P( \beta ) \\ \end{align*} \tag{2}

where \propto denotes proportionality. This implies that the posterior distribution of \beta is proportional (up to a constant) to the product of the outcome’s likelihood and the parameter’s prior distribution. This definition makes the calculation of posterior distributions easier, by separating the parameter’s updating process from the integration of new empirical data (this will be clearly seen in the code provided in Section 3.1.3).

Second, a dataset usually has multiple observations of the outcome Y and covariates X, in the form of y_{i} and x_{i}. Therefore, by the laws of probability, and assuming independence among the observations, the likelihood of a dataset can be rewritten as the product of all individual likelihoods. Consequently, Equation 2 can also be rewritten as follows:

\begin{align*} P(\beta | Y, X ) &\propto \prod_{i=1}^{n} P( y_{i} | \beta, x_{i} ) \cdot P( \beta ) \end{align*} \tag{3}
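As a small numerical illustration of Equation 3, the sketch below evaluates the likelihood of a single candidate value of \beta both as a product of individual densities and as a sum of individual log-densities. The simulated data and all variable names are assumptions made for this example and are not part of the study. The two calculations agree, but with many observations the raw product can underflow to zero, which is one reason the log scale is commonly preferred in practice.

```r
# illustrative data (assumed values, not the study data)
set.seed(1)
x = rnorm( n=50, mean=0, sd=1 )
y = rnorm( n=50, mean=0.2*x, sd=1 )

b = 0.2   # one candidate value for beta

# likelihood as a product of individual likelihoods (Equation 3)
lik_prod = prod( dnorm( y, mean=b*x, sd=1 ) )

# same quantity on the log scale: sum of individual log-likelihoods
loglik_sum = sum( dnorm( y, mean=b*x, sd=1, log=TRUE ) )

all.equal( log(lik_prod), loglik_sum )

# with many observations the raw product underflows, the log-sum does not
x_big = rep( x, times=100 )
y_big = rep( y, times=100 )
lik_big = prod( dnorm( y_big, mean=b*x_big, sd=1 ) )
loglik_big = sum( dnorm( y_big, mean=b*x_big, sd=1, log=TRUE ) )
c( product=lik_big, log_sum=loglik_big )
```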

3.1.2 Estimation methods

Several methods within the Bayesian inference procedures can be utilized to estimate the posterior distribution of the parameter, and most of these fall into the category of Markov Chain Monte Carlo methods (MCMC). MCMC are methods to indirectly simulate random observations from probability distributions using stochastic processes (Everitt & Skrondal, 2010) 2. However, when the parameters of interest are not large in number, a useful pedagogical method to produce the posterior distribution is the grid approximation method. Through this method, an excellent approximation of the parameter’s posterior distribution can be achieved by considering a finite candidate list of parameter values. This method is used in Section 3.1.3 to illustrate how the Bayesian inference works 3.

Markov Chain Monte Carlo (MCMC)

Methods to indirectly simulate random observations from probability distributions using stochastic processes (Everitt & Skrondal, 2010).

Grid approximation

Method to indirectly simulate random observations from low dimensional continuous probability distributions, by considering a finite candidate list of parameter values (McElreath, 2020).
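To give a concrete sense of what an MCMC method does, the following is a minimal sketch of a random-walk Metropolis sampler for a regression with a single slope \beta, a residual standard deviation fixed at one, and a Uniform(-20, +20) prior. The simulated data, the proposal standard deviation, the number of iterations and the warm-up length are all assumptions chosen for brevity. This algorithm is shown only for illustration and is not claimed to be the sampler used for the models in the study.

```r
# illustrative data (assumed values)
set.seed(2024)
n = 100
x = rnorm( n=n, mean=0, sd=1 )
y = rnorm( n=n, mean=0.2*x, sd=1 )

# unnormalized log-posterior: log-likelihood + log-prior
log_post = function(b){
  sum( dnorm( y, mean=b*x, sd=1, log=TRUE ) ) +
    dunif( b, min=-20, max=20, log=TRUE )
}

# random-walk Metropolis sampler
n_iter = 5000
draws = numeric(n_iter)
b_cur = 0                      # starting value
lp_cur = log_post(b_cur)

for( s in 1:n_iter ){
  b_new = b_cur + rnorm( n=1, mean=0, sd=0.2 )   # proposal around current value
  lp_new = log_post(b_new)
  # accept with probability min(1, posterior ratio)
  if( log( runif(1) ) < lp_new - lp_cur ){
    b_cur = b_new
    lp_cur = lp_new
  }
  draws[s] = b_cur
}

draws = draws[ -(1:1000) ]     # discard warm-up iterations
c( mean=mean(draws), sd=sd(draws) )
```

The retained draws behave as (correlated) observations from the posterior distribution, so their mean and standard deviation can be used as posterior summaries.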

3.1.3 How does it work?

A simple Bayesian linear regression model can be written in the following form:

\begin{align*} y_{i} &= \beta \cdot x_{i} + e_{i} \\ e_{i} &\sim \text{Normal}( 0, 1 ) \\ \beta &\sim \text{Uniform}( -20, +20 ) \end{align*}

where y_{i} denotes the outcome’s observation i, \beta the expected effect of the observed covariate x_{i} on the outcome, and e_{i} the outcome’s residual in observation i. Furthermore, the residuals e_{i} are assumed to follow a normal distribution with mean zero and standard deviation equal to one. Lastly, prior to observing any data, it is assumed that \beta is uniformly distributed within the range of [-20,+20]. This prior implies that equal probabilities are assigned to all values of \beta within this range.

However, a more convenient generalized manner to represent the same linear regression model is as follows:

\begin{align*} y_{i} &\sim \text{Normal}( \mu_{i}, 1 ) \\ \mu_{i} &= \beta \cdot x_{i} \\ \beta &\sim \text{Uniform}( -20, +20 ) \end{align*}

In this definition, the components of the Bayesian inference procedure detailed in Section 3.1.1 are more easily spotted. First, regarding the likelihood, the outcome is assumed to be normally distributed with mean \mu_{i} and standard deviation equal to one. Second, it is assumed that \beta has a uniform prior within the range of [-20,+20]. Moreover, the equations reveal that the mean of the outcome \mu_{i} is modeled by a linear predictor composed of the covariate x_{i} and its effect on the outcome \beta.

For illustration purposes, a simulated regression with n=100 observations was generated assuming \beta=0.2. Figure 1 shows the scatter plot of the generated data (see code below). The grid approximation method is used to generate random observations from the posterior distribution of \beta. Two noteworthy results emerge from the approach. Firstly, once the posterior distribution is generated, various summaries can be used to make inferences about the parameter of interest (refer to the code output below). Secondly, when considering a dataset with n=100 observations, the influence of the prior on the posterior distribution of \beta is negligible. Specifically, prior to observing any data, assuming that \beta could take any value within the range of [-20,+20] with equal probability (left panel of Figure 2) did not have a substantial impact on the distribution of \beta after empirical data was observed (right panel of Figure 2).

Code
set.seed(12345)                      # replication seed
n = 100                              # simulation sample size

b = 0.2                              # covariate effect
x = rnorm( n=n, mean=0, sd=1 )       # covariate simulation

mu_y = b*x                           # linear predictor on outcome mean
y = rnorm( n=n, mean=mu_y, sd=1 )    # outcome simulation
Code
# grid approximation
Ngp = 1000                                         # number of points in candidate list

b_cand = seq( from=-20, to=20, length.out=Ngp )    # candidate list for parameter

# user defined function: linear predictor for each candidate
udf = function(i){ b_cand[i]*x }
# calculation of the linear predictor for each candidate
mu_y = sapply( 1:length(b_cand), udf )

# user defined function: product of individual observation likelihoods
udf = function(i){ prod( dnorm( y, mean=mu_y[,i], sd=1 ) ) }
# outcome data likelihood
y_lik = sapply( 1:length(b_cand), udf )

# uniform prior distribution for parameter (min=-20, max=20)
b_prior = rep( 1/40, length(b_cand) )

# proportional posterior distribution for parameter
b_prop = y_lik * b_prior
# posterior distribution for parameter
b_post = b_prop / sum(b_prop)
Code
# true value for the parameter
paste0( 'true beta = ', b )

# expected value for the parameter
b_exp = sum( b_cand * b_post )
paste0( 'estimated beta (expectation) = ', round(b_exp, 3) )

# maximum probability value for the parameter
b_max = b_cand[ which.max(b_post) ]
paste0( 'estimated beta (maximum probability) = ', round(b_max, 3) )

# standard deviation for the parameter
b_sd = sqrt( sum( ( (b_cand-b_exp)^2 ) * b_post ) )
paste0( 'estimated beta (standard deviation) = ', round(b_sd, 3) )

# probability that the parameter is greater than zero
b_prob = sum( b_post[ b_cand > 0 ] )
paste0( 'P(estimated beta > 0) = ', round(b_prob, 3) )
[1] "true beta = 0.2"
[1] "estimated beta (expectation) = 0.299"
[1] "estimated beta (maximum probability) = 0.3"
[1] "estimated beta (standard deviation) = 0.088"
[1] "P(estimated beta > 0) = 1"
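For this particular model, the grid approximation can be cross-checked against a closed-form result. With a flat prior over a wide range and a residual standard deviation fixed at one, the posterior of \beta is approximately Normal with mean \sum x_{i} y_{i} / \sum x_{i}^{2} and standard deviation 1 / \sqrt{\sum x_{i}^{2}}. The sketch below re-creates the simulated data with the same seed and settings as above and compares both approaches. The grid is evaluated on the log scale, which is an assumption of this example (chosen for numerical stability) and not the calculation shown in the original code.

```r
# same simulation settings as the example above
set.seed(12345)
n = 100
b = 0.2
x = rnorm( n=n, mean=0, sd=1 )
y = rnorm( n=n, mean=b*x, sd=1 )

# closed-form posterior under a flat prior and known residual sd = 1
exact_mean = sum( x*y ) / sum( x^2 )
exact_sd = 1 / sqrt( sum( x^2 ) )

# grid approximation evaluated on the log scale
b_cand = seq( from=-20, to=20, length.out=1000 )
log_lik = sapply( b_cand, function(bc){
  sum( dnorm( y, mean=bc*x, sd=1, log=TRUE ) )
})
b_post = exp( log_lik - max(log_lik) )   # subtract the maximum to avoid underflow
b_post = b_post / sum(b_post)            # normalize over the candidate list

grid_mean = sum( b_cand * b_post )
grid_sd = sqrt( sum( (b_cand - grid_mean)^2 * b_post ) )

round( c( exact_mean=exact_mean, grid_mean=grid_mean,
          exact_sd=exact_sd, grid_sd=grid_sd ), 3 )
```

If the two sets of summaries are close, the candidate list is fine enough for the posterior at hand. A coarser list or a much narrower posterior would call for more grid points.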
Code
# simulation plot
plot( x, y, xlim=c(-3,3), ylim=c(-3,3),
      pch=19, col=rgb(0,0,0,alpha=0.3) )
abline( a=0, b=b, lty=2, col='blue' )
abline( a=0, b=b_exp, lty=2, col='red' )
legend( 'topleft', legend=c('true', 'expected'),
        bty='n', col=c('blue','red'), lty=2 )
Figure 1: Outcome simulation
Code
par(mfrow=c(1,2))

# prior distribution density plot
plot( b_cand, b_prior, type='l', xlim=c(-1.5,1.5),
      main='Prior distribution',
      xlab=expression(beta), ylab='probability' )
abline( v=0, lty=2, col='gray' )

# posterior distribution density plot
plot( b_cand, b_post, type='l', xlim=c(-1,1),
      main='Posterior distribution',
      xlab=expression(beta), ylab='probability' )
abline( v=c(0, b, b_exp), lty=2,
        col=c('gray','blue','red') )
legend( 'topleft', legend=c('true', 'expected'),
        bty='n', col=c('blue','red'), lty=2 )

par(mfrow=c(1,1))
Figure 2: Bayesian inference: grid approximation

3.1.4 Priors and their effects

Prior to observing empirical data, assuming the parameter could take any value within the range of [-20,+20] with equal probability is not the only prior assumption that can be made. Different levels of uncertainty associated with a parameter can be encoded by different priors. This concept is illustrated in Figure 3 through Figure 5, where three different types of priors are used to encode three levels of uncertainty about the parameter \beta.

Code
# grid approximation
Ngp = 1000                     # number of points in candidate list

# candidate list for parameter
post = data.frame( b_cand=seq( from=-20, to=20, length.out=Ngp ) )

# user defined function: linear predictor for each candidate
ud_func = function(i){ post$b_cand[i]*x }
# calculation of the linear predictor for each candidate
mu_y = sapply( 1:length(post$b_cand), ud_func )

# user defined function: product of individual observation likelihoods
ud_func = function(i){ prod( dnorm( y, mean=mu_y[,i], sd=1 ) ) }
# outcome data likelihood
y_lik = sapply( 1:length(post$b_cand), ud_func )

# prior 1: uniform prior distribution (min=-20, max=+20)
post$b_prior1 = rep( 1/40, length(post$b_cand) )
# prior 2: normal prior distribution (mean=0, sd=0.5)
post$b_prior2 = dnorm( post$b_cand, mean=0, sd=0.5 )
# prior 3: normal prior distribution (mean=0.2, sd=0.05)
post$b_prior3 = dnorm( post$b_cand, mean=0.2, sd=0.05 )

# posterior distribution for each prior (columns b_post1 to b_post3)
for( i in 1:3 ){
  b_prop = y_lik * post[, paste0('b_prior',i) ]
  post[, paste0('b_post',i) ] = b_prop / sum(b_prop)
}

First, the distribution depicted in Figure 3 assumes \beta \sim \text{Uniform}(-20, +20) (the same prior used in Section 3.1.3). The distribution does not restrain the effect of \beta to be more probable in any range within [-20, +20]. This type of distribution is commonly referred to as a non-informative prior. A non-informative prior reflects the distributional commitment of a parameter to a wide range of values within a specific parameter space (Everitt & Skrondal, 2010).

Non-informative priors

Prior that reflects the distributional commitment of a parameter to a wide range of values within a specific parameter space (Everitt & Skrondal, 2010).

Code
par(mfrow=c(1,2))

# prior distribution density plot
plot( post[, c('b_cand','b_prior1')], type='l',
      xlim=c(-1.5,1.5), main='Prior distribution',
      xlab=expression(beta), ylab='probability' )
abline( v=0, lty=2, col='gray' )

# posterior distribution density plot
plot( post[, c( 'b_cand','b_post1')], type='l',
      xlim=c(-1,1), main='Posterior distribution',
      xlab=expression(beta), ylab='probability' )
abline( v=c(0, b), lty=2, col=c('gray','blue') )

par(mfrow=c(1,1))
Figure 3: Bayesian inference: posterior distributions with non-informative prior distribution.

Second, the distribution described in Figure 4 assumes \beta \sim \text{Normal}(0, 0.5). Consequently, the effect of \beta is more probable within the range [-1,+1], with less probability associated with parameter values outside this range. This is an example of a weakly-informative prior distribution. Weakly informative priors reflect the distributional commitment of a parameter to a weakly constrained range of values within a realistic parameter space (McElreath, 2020).

Weakly informative priors

Prior that reflects the distributional commitment of a parameter to a weakly constrained range of values within a realistic parameter space (McElreath, 2020).

Code
par(mfrow=c(1,2))

# prior distribution density plot
plot( post[, c('b_cand','b_prior2')], type='l',
      xlim=c(-1.5,1.5), main='Prior distribution',
      xlab=expression(beta), ylab='probability' )
abline( v=0, lty=2, col='gray' )

# posterior distribution density plot
plot( post[, c( 'b_cand','b_post2')], type='l',
      xlim=c(-1,1), main='Posterior distribution',
      xlab=expression(beta), ylab='probability' )
abline( v=c(0, b), lty=2, col=c('gray','blue') )

par(mfrow=c(1,1))
Figure 4: Bayesian inference: posterior distributions with weakly-informative prior distribution.

Third, the distribution described in Figure 5 assumes \beta \sim \text{Normal}(0.2, 0.05). As a result, the effect of \beta is more probable within the range [0.1,0.3], with less probability associated with parameter values outside this range. This is an example of an informative prior distribution. Informative priors are distributions that express specific and definite information about a parameter (McElreath, 2020).

Informative priors

Prior distributions that express specific and definite information about a parameter (McElreath, 2020).

Code
par(mfrow=c(1,2))

# prior distribution density plot
plot( post[, c('b_cand','b_prior3')], type='l',
      xlim=c(-1.5,1.5), main='Prior distribution',
      xlab=expression(beta), ylab='probability' )
abline( v=0, lty=2, col='gray' )

# posterior distribution density plot
plot( post[, c( 'b_cand','b_post3')], type='l',
      xlim=c(-1,1), main='Posterior distribution',
      xlab=expression(beta), ylab='probability' )
abline( v=c(0, b), lty=2, col=c('gray','blue') )

par(mfrow=c(1,1))
Figure 5: Bayesian inference: posterior distributions with informative prior distributions.

Lastly, regarding the influence of different priors on the posterior distributions, Figure 3 and Figure 4 reveal that non-informative and weakly-informative priors have a negligible influence on the posterior distribution. Both priors result in similar posteriors. Furthermore, the figures show that the data sample size n=100 is still not enough to provide an unbiased and precise estimation of the true effect. In contrast, Figure 5 shows that informative priors can have a meaningful influence on the posterior distribution. In this particular case, the prior helps to estimate an unbiased and more precise effect. This result shows that when the data sample size is not sufficiently large, the prior assumptions can play a significant role in obtaining appropriate parameter estimates.
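The relative influence of a normal prior can also be quantified directly for this model. Because the residual standard deviation is fixed at one, a Normal(m_0, s_0) prior on \beta yields a normal posterior whose mean is a precision-weighted average of the prior mean and the data-based estimate. The sketch below computes that weight for the weakly-informative and informative priors used above. The function name `post_normal` and its arguments are hypothetical, introduced only for this illustration, and the data are re-created with the same seed and settings as in Section 3.1.3.

```r
# same simulation settings as the examples above
set.seed(12345)
n = 100
b = 0.2
x = rnorm( n=n, mean=0, sd=1 )
y = rnorm( n=n, mean=b*x, sd=1 )

# closed-form posterior for beta with a Normal(m0, s0) prior
# and known residual sd = 1 (hypothetical helper for this example)
post_normal = function(m0, s0, x, y){
  prec_prior = 1 / s0^2                        # prior precision
  prec_data = sum( x^2 )                       # precision contributed by the data
  w = prec_data / ( prec_prior + prec_data )   # weight given to the data
  m = ( prec_prior*m0 + sum(x*y) ) / ( prec_prior + prec_data )
  c( weight_data=w, post_mean=m,
     post_sd=1/sqrt( prec_prior + prec_data ) )
}

res = rbind(
  weakly_informative = post_normal( m0=0, s0=0.5, x=x, y=y ),
  informative = post_normal( m0=0.2, s0=0.05, x=x, y=y )
)
round( res, 3 )
```

A weight close to one means the posterior is driven almost entirely by the data, while a small weight means the prior dominates. Increasing n raises the data precision and therefore shrinks the influence of any fixed prior.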

3.1.5 What are Hyperpriors?

In cases requiring greater modeling flexibility, a more refined representation of the parameters’ priors can be defined in terms of hyperparameters and hyperpriors. Hyperparameters refer to parameters indexing a family of possible prior distributions for the original parameter, while hyperpriors are prior distributions for such hyperparameters (Everitt & Skrondal, 2010).

Hyperparameters

Parameter \theta_{2} that indexes a family of possible prior distributions for another parameter \theta_{1} (Everitt & Skrondal, 2010).

Hyperpriors

Prior distributions for hyperparameters (Everitt & Skrondal, 2010).

A simple example of the use of hyperpriors would be to define the regression model shown in Section 3.1.3 in the following form:

\begin{align*} y_{i} &\sim \text{Normal}( \mu_{i}, 1 ) \\ \mu_{i} &= \beta \cdot x_{i} \\ \beta &\sim \text{Normal}( 0, \text{exp}(v) ) \\ v &\sim \text{Normal}(0, 3) \end{align*}

where v defines the hyperparameter for the parameter \beta, and its associated distribution defines its hyperprior.
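One way to see what such a hyperprior implies is to simulate from it. The sketch below draws v from its hyperprior, then draws \beta given v, and compares the result with a prior whose scale is fixed at one. The fixed-scale comparison, the number of draws and the variable names are assumptions of this example. The upper quantiles of |\beta| show that letting the scale vary produces a prior with much heavier tails.

```r
set.seed(1)
n = 10000

# hyperprior: v ~ Normal(0, 3)
v = rnorm( n=n, mean=0, sd=3 )

# prior for beta given the hyperparameter: beta ~ Normal(0, exp(v))
b_hier = rnorm( n=n, mean=0, sd=exp(v) )

# prior with a fixed scale, for comparison (assumed, not from the study)
b_fixed = rnorm( n=n, mean=0, sd=1 )

# quantiles of the absolute value of beta under each prior
probs = c(0.5, 0.9, 0.99)
q_fixed = quantile( abs(b_fixed), probs=probs )
q_hier = quantile( abs(b_hier), probs=probs )
round( rbind( fixed_scale=q_fixed, hyperprior=q_hier ), 2 )
```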

However, setting prior distributions through hyperparameters brings its own challenges. One notable challenge pertains to the geometry of the parameter’s sample space. This implies that prior probabilistic representations defined in terms of hyperparameters sometimes exhibit more complex sample geometries than simple priors 4. The reparametrization of hyperpriors into simpler sample geometries leads to the notion of non-centered priors. Non-centered priors express a parameter’s prior distribution in terms of a hyperparameter, which in turn is defined by a transformation of the original parameter of interest (Gorinova et al., 2019). By incorporating non-centered priors, researchers can ensure the reliability of certain posterior distributions within Bayesian inference procedures. To illustrate, a straightforward example of a non-centered reparametrization of a prior can be demonstrated as follows:

Non-centered priors

Expression of a parameter’s distribution in terms of a hyperparameter defined by a transformation of the original parameter of interest (Gorinova et al., 2019).

\begin{align*} y_{i} &\sim \text{Normal}( \mu_{i}, 1 ) \\ \mu_{i} &= \beta \cdot x_{i} \\ \beta &= z \cdot \text{exp}(v) \\ v &\sim \text{Normal}(0, 3) \\ z &\sim \text{Normal}( 0, 1 ) \end{align*}

where z is a hyperparameter sampled independently from v, and the parameter of interest \beta is obtained as a transformation of the two hyperparameters. Figure 6 illustrates the differences in sampling geometries between a centered and a non-centered parametrization. It is evident that the sampling geometry depicted in the left panel of the figure is narrower than the one depicted in the right panel, and as a result, Bayesian inference procedures can have a harder time sampling from the former than from the latter.

Code
n = 5000                                    # simulation sample size

# hyperparameter simulation
v = rnorm( n=n, mean=0, sd=1 )
z = rnorm( n=n, mean=0, sd=1 )

b_cent = rnorm( n=n, mean=0, sd=exp(v) )    # centered parametrization simulation
b_non = z*exp(v)                            # non-centered parametrization simulation
Code
par( mfrow=c(1,2) )

# plot of centered parametrization
plot( b_cent, v, pch=19, col=rgb(0,0,0,alpha=0.1),
      xlab=expression(beta), ylab=expression(v),
      main='Centered parametrization' )

# plot of non-centered parametrization
plot( z, v, pch=19, col=rgb(0,0,0,alpha=0.1),
      xlab=expression(z), ylab=expression(v),
      main='Non-centered parametrization' )

par( mfrow=c(1,1) )
Figure 6: Centered and non-centered parameter spaces
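The two parametrizations can also be compared numerically. The sketch below, which uses its own seed and variable names as assumptions of the example, checks two things: that the centered and non-centered constructions imply the same distribution for \beta, and that the quantity being sampled depends strongly on v in the centered case but not in the non-centered case. The second point is what the two panels of Figure 6 show graphically.

```r
set.seed(1)
n = 5000
v = rnorm( n=n, mean=0, sd=1 )   # hyperparameter (log-scale of beta)
z = rnorm( n=n, mean=0, sd=1 )   # standardized auxiliary parameter

b_cent = rnorm( n=n, mean=0, sd=exp(v) )   # centered: beta drawn given v
b_non = z * exp(v)                          # non-centered: beta built from z and v

# both parametrizations imply the same distribution for beta
probs = c(0.1, 0.25, 0.5, 0.75, 0.9)
round( rbind( centered=quantile( b_cent, probs ),
              non_centered=quantile( b_non, probs ) ), 2 )

# but the sampled quantities relate to v very differently
cor_cent = cor( log( abs(b_cent) ), v )   # magnitude of beta tracks v
cor_non = cor( log( abs(z) ), v )         # z carries no information about v
round( c( centered=cor_cent, non_centered=cor_non ), 2 )
```

A sampler working with (z, v) explores two independent quantities on a common scale, whereas a sampler working with (\beta, v) must adapt to a spread in \beta that changes with every value of v.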

3.1.6 Importance

The selection of the Bayesian approach was based on three key properties. Firstly, empirical evidence from prior research demonstrates that Bayesian methods outperform frequentist methods, particularly in handling complex and over-parameterized models (Baker, 1998; Kim & Cohen, 1999). This superiority is evident when dealing with complex models, like the proposed GLLAMM, that are challenging to program or are not viable under frequentist methods (Depaoli, 2014).

Secondly, the approach allows for the incorporation of prior information, ensuring that certain parameters are confined within specified boundaries. This helps mitigate non-convergence or improper parameter estimation issues commonly observed in complex models under frequentist methods (Martin & McDonald, 1975; Seaman & Stamey, 2011). In this study, for example, this property was leveraged to incorporate information about the variances of random effects and constrain them to be positive.

Lastly, the Bayesian approach demonstrates proficiency in handling relatively small sample sizes (Baldwin & Fellingham, 2013; Depaoli, 2014; Lambert, Sutton, Burton, Abrams, & Jones, 2006). In this case, despite the study dealing with 2,263 entropy scores, these were derived from a modest sample size of 32 speakers, from whom the inferences are drawn. Consequently, reliance on the asymptotic properties of frequentist methods may not be warranted in this context, underscoring the pertinence of this property to the current study.

Benefits of Bayesian inference procedures

More suitable to deal with:

  1. Complex or highly-parameterized models.
  2. Parameter constraints.
  3. Small sample sizes.

3.2 A tale of two distributions

3.2.1 The normal distribution

A normal distribution is a type of continuous probability distribution in which a random variable can take on values along the real line \left( y_{i} \in (-\infty, \infty) \right). The distribution is characterized by two independent parameters: the mean \mu and the standard deviation \sigma (Everitt & Skrondal, 2010). Thus, a random variable can take on values that are gathered around a mean \mu, with some values dispersed based on some amount of deviation \sigma, without any restriction. Importantly, by definition of the normal distribution, the location (mean) of the distribution does not influence its spread (deviation).

Figure 7 illustrates how the distribution of an outcome changes with different values of \mu and \sigma. The left panel demonstrates that the distribution of the outcome can shift in terms of its location based on the value of \mu. The right panel shows how the distribution of the outcome can become narrower or wider based on the values of \sigma. It is noteworthy that alterations in the mean \mu of the distribution have no impact on its standard deviation \sigma.

Code
require(rethinking)            # required package (provides col.alpha)

# parameters to plot: means and standard deviations
mu = c(-1.5, 0, 1.5)
sigma = c(1.5, 1, 0.5)

par(mfrow=c(1,2))

# plotting normal distribution with different 'mu' and 'sigma=1'
cp = sapply( 1:length(mu), col.alpha, alpha=0.7)
for(i in 1:length(mu)){
  if(i==1){
    curve( dnorm(x, mean=mu[i], sd=1),
           from=-3, to=3, ylim=c(0,1.5), lwd=2, col=cp[i],
           xlab="outcome values", ylab="density")
    abline(v=mu, col='gray', lty=2)
    legend('topleft', col=c(cp,'gray'), lwd=2, bty='n',
           legend=expression( mu[1]==-1.5,
                              mu[2]==0,
                              mu[3]==+1.5,
                              sigma==1) )
  } else{
    curve( dnorm(x, mean=mu[i], sd=1),
           from=-3, to=3, ylim=c(0,1.5), lwd=2, col=cp[i],
           xlab="", ylab="", add=TRUE )
  }
}

# plotting normal distribution with 'mu=0' and different sigma's
cp = sapply( 1:length(sigma), col.alpha, alpha=0.7)
for(i in 1:length(sigma)){
  if(i==1){
    curve( dnorm(x, mean=0, sd=sigma[i]),
           from=-3, to=3, ylim=c(0,1.5), lwd=2, col=cp[i],
           xlab="outcome values", ylab="density")
    abline(v=0, col='gray', lty=2)
    legend('topleft', col=c(cp,'gray'), lwd=2, bty='n',
           legend=expression( sigma[1]==1.5,
                              sigma[2]==1,
                              sigma[3]==0.5,
                              mu==0) )
  } else{
    curve( dnorm(x, mean=0, sd=sigma[i]),
           from=-3, to=3, ylim=c(0,1.5), lwd=2, col=cp[i],
           xlab="", ylab="", add=TRUE )
  }
}

par(mfrow=c(1,1))
Figure 7: Normal distribution with different mean and standard deviations

3.2.2 The beta-proportion distribution

A beta-proportion distribution is a type of continuous probability distribution in which a random variable can assume values within the continuous interval between zero and one \left( y_{i} \in (0, 1) \right). The distribution is characterized by two parameters: the mean \mu and the sample size M (Everitt & Skrondal, 2010). This implies that a random variable can take on values restricted within the unit interval, centered around a mean \mu, with some values being more dispersed based on the sample size M. Additionally, two characteristics define the distribution. Firstly, like the random variable, the mean of the distribution can only take values within the unit interval (\mu \in (0,1)). Secondly, the mean and sample size parameters no longer act independently of each other: the dispersion of the distribution depends on both.

Figure 8 illustrates how an outcome with a beta-proportion distribution changes with different values of \mu and M. The figure reveals two prevalent patterns in the distribution: (1) the behavior of the dispersion, as measured by the sample size, depends on the mean of the distribution, and (2) the larger the sample size, the less dispersed the distribution is within the unit interval.

Code
require(rethinking)            # required package (provides dbeta2, col.alpha)

# parameters to plot: means and 'sample size'
mu = c(0.2, 0.5, 0.8)
M = c(2, 5, 20)

par(mfrow=c(1,2))

# plotting beta-proportion distribution with different 'mu' and 'M=10'
cp = sapply( 1:length(mu), col.alpha, alpha=0.7)
for(i in 1:length(mu)){
  if(i==1){
    curve( dbeta2(x, prob=mu[i], theta=10),
           from=0, to=1, ylim=c(0,8), lwd=2, col=cp[i],
           xlab="outcome values", ylab="density")
    abline(v=mu, col='gray', lty=2)
    legend('topleft', col=c(cp,'gray'), lwd=2, bty='n',
           legend=expression( mu[1]==0.2,
                              mu[2]==0.5,
                              mu[3]==0.8,
                              M==10) )
  } else{
    curve( dbeta2(x, prob=mu[i], theta=10),
           from=0, to=1, ylim=c(0,8), lwd=2, col=cp[i],
           xlab="", ylab="", add=TRUE )
  }
}

# plotting beta-proportion distribution with 'mu=0.3' and different M's
cp = sapply( 1:length(M), col.alpha, alpha=0.7)
for(i in 1:length(M)){
  if(i==1){
    curve( dbeta2(x, prob=0.3, theta=M[i]),
           from=0, to=1, ylim=c(0,8), lwd=2, col=cp[i],
           xlab="outcome values", ylab="density")
    abline(v=0.3, col='gray', lty=2)
    legend('topleft', col=c(cp,'gray'), lwd=2, bty='n',
           legend=expression( M[1]==2,
                              M[2]==5,
                              M[3]==20,
                              mu==0.3) )
  } else{
    curve( dbeta2(x, prob=0.3, theta=M[i]),
           from=0, to=1, ylim=c(0,8), lwd=2, col=cp[i],
           xlab="", ylab="", add=TRUE )
  }
}

par(mfrow=c(1,1))
Figure 8: Beta-proportion distribution with different mean and sample sizes
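The dependence between mean and dispersion can be made explicit. A beta-proportion distribution with mean \mu and sample size M corresponds to a standard beta distribution with shape parameters \mu M and (1-\mu) M, whose standard deviation is \sqrt{\mu (1-\mu) / (M+1)}. The sketch below uses only base R, and the helper names `dbeta_prop` and `rbeta_prop` are hypothetical, introduced for this example. To the best of our understanding this is the same parametrization used by `dbeta2` in the code above, but that is stated here as an assumption.

```r
set.seed(1)

# beta-proportion density and random generator via base R
# (hypothetical helper names for this example)
dbeta_prop = function(y, mu, M){
  dbeta( y, shape1=mu*M, shape2=(1-mu)*M )
}
rbeta_prop = function(n, mu, M){
  rbeta( n, shape1=mu*M, shape2=(1-mu)*M )
}

# the density integrates to one over the unit interval
total_prob = integrate( dbeta_prop, lower=0, upper=1, mu=0.3, M=10 )$value

# standard deviation for several combinations of mean and 'sample size'
grid = expand.grid( mu=c(0.2, 0.5, 0.8), M=c(2, 10, 20) )
grid$sd_formula = sqrt( grid$mu*(1-grid$mu) / (grid$M+1) )
grid$sd_simulated = mapply(
  function(mu, M){ sd( rbeta_prop( 1e5, mu, M ) ) },
  grid$mu, grid$M )

round( grid, 3 )
```

For a fixed M, the standard deviation is largest at \mu=0.5 and shrinks as the mean approaches either bound. For a fixed \mu, it shrinks as M grows.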

3.2.3 Importance

It is crucial to comprehend what it means for an outcome to follow a normal distribution, as the assumption of normally distributed outcomes is ubiquitous in speech intelligibility research (see Boonen, Kloots, Nurzia, & Gillis, 2021; Flipsen, 2006; Lagerberg, Asberg, Hartelius, & Persson, 2014).

In contrast, the significance of the beta-proportion distribution lies in providing a suitable alternative for modeling non-normal, bounded outcomes, such as the entropy scores utilized in this study. Boundedness refers to the restriction of data values within specific bounds or intervals, beyond which they cannot occur (Lebl, 2022). Neglecting the bounded nature of an outcome can lead, at best, to underfitting, and, at worst, to misspecification. Underfitting occurs when statistical models fail to capture the underlying data patterns, potentially causing the generation of predictions outside the data range and hindering the model’s ability to generalize its results when confronted with new data. Conversely, misspecification, marked by a poor representation of relevant aspects of the true data in the model’s functional form or covariate inclusion, can lead to inconsistent and inefficient parameter estimates (Everitt & Skrondal, 2010).

Boundedness

Refers to the restriction of data values within specific bounds or intervals, beyond which they cannot occur (Lebl, 2022)

Underfitting

Occurs when statistical models fail to capture the underlying data patterns, potentially generating predictions outside the data range and hindering the model’s ability to generalize its results when confronted with new data (Everitt & Skrondal, 2010).

Misspecification

Occurs when the model’s functional form or inclusion of covariates poorly represents relevant aspects of the true data. This can lead to inconsistent and inefficient parameter estimates (Everitt & Skrondal, 2010).
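
To make the risk of predictions outside the data range concrete, the following hedged sketch compares a bounded outcome with a normal approximation that matches its mean and standard deviation. The values mu = 0.1 and M = 10 are illustrative assumptions, not estimates from the study, and the beta-proportion distribution is again assumed to have shape parameters \mu M and (1-\mu) M.

```r
mu = 0.1; M = 10
sd_y = sqrt( mu*(1-mu) / (M+1) )   # sd under shape1=mu*M, shape2=(1-mu)*M

# probability mass that each model places below the lower bound of zero
p_below_normal = pnorm( 0, mean=mu, sd=sd_y )
p_below_beta = pbeta( 0, shape1=mu*M, shape2=(1-mu)*M )

round( c(normal_below_0=p_below_normal, beta_below_0=p_below_beta), 3 )
```

The normal approximation assigns a non-negligible share of its predictions to impossible negative values, whereas the bounded distribution assigns none.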

3.3 Linear Mixed Models

3.3.1 The ordinary LMM

An ordinary linear mixed model (LMM) is a procedure employed to estimate a linear relationship between the mean of a normally distributed outcome with clustered observations, and one or more covariates (Holmes, Bolin, & Kelley, 2019). A commonly known Bayesian probabilistic representation of an ordinary LMM can be expressed as follows:

Ordinary linear mixed model (LMM)

Procedure employed to estimate a linear relationship between the mean of a normally distributed outcome with clustered observations, and one or more covariates (Holmes et al., 2019).

\begin{align*} y_{ib} &= \beta x_{i} + a_{b} + \varepsilon_{ib} \\ \varepsilon_{ib} &\sim \text{Normal}(0, 1) \\ \beta &\sim \text{Normal}(0, 0.5) \\ a_{b} &\sim \text{Normal}(0, 1) \end{align*}

where y_{ib} denotes the outcome’s i’th observation clustered in block b, and x_{i} denotes the covariate for observation i. Moreover, \beta denotes the fixed slope of the regression, a_{b} denotes the random effects, and \varepsilon_{ib} denotes the random outcome residuals. The residuals \varepsilon_{ib} are assumed to be normally distributed with mean zero and standard deviation equal to one. Additionally, prior to observing any data, \beta is assumed to be normally distributed with mean zero and standard deviation equal to 0.5. Similarly, a_{b} is assumed to be normally distributed with mean zero and standard deviation equal to one.
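
The representation can be read as a recipe for generating data. The following is a minimal simulation sketch in base R; the number of blocks, the number of observations per block, and the standard normal covariate are assumptions made only for illustration.

```r
set.seed(12345)
B = 5                          # number of blocks (assumed)
n_b = 20                       # observations per block (assumed)
block = rep( 1:B, each=n_b )   # block membership of each observation
n = length(block)

beta = rnorm( n=1, mean=0, sd=0.5 )   # fixed slope, drawn from its prior
a = rnorm( n=B, mean=0, sd=1 )        # random effects, one per block
x = rnorm( n=n, mean=0, sd=1 )        # covariate (assumed standard normal)
e = rnorm( n=n, mean=0, sd=1 )        # residuals

y = beta*x + a[block] + e             # outcome

# observations in the same block share a[b], so the block means differ
block_means = tapply( y, block, mean )
round( block_means, 2 )
```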

3.3.2 The generalized LMM

Generalized linear mixed models (GLMMs) are a set of models used to estimate a (non)linear relationship between the mean of a (non)normally distributed outcome with clustered observations, and one or more covariates (Lee & Nelder, 1996). Interestingly, the ordinary Bayesian LMM detailed in the previous section can be represented as a special case of a GLMM, as follows:

Generalized linear mixed model (GLMM)

Procedure employed to estimate a (non)linear relationship between the mean of a (non)normally distributed outcome with clustered observations, and one or more covariates (Lee & Nelder, 1996).

\begin{align*} y_{ib} &\sim \text{Normal}( \mu_{ib}, 1) \\ \mu_{ib} &= \beta x_{i} + a_{b} \\ \beta &\sim \text{Normal}(0, 0.5) \\ a_{b} &\sim \text{Normal}(0, 1) \\ \end{align*}

Notice this representation explicitly highlights the three components of a GLMM: the likelihood component, the linear predictor, and the link function (McElreath, 2020). The likelihood component specifies the assumption about the distribution of an outcome, in this case a normal distribution with mean \mu_{ib} and standard deviation equal to one. The linear predictor specifies the manner in which the covariate will predict the mean of the outcome. In this case the linear predictor is a linear combination of the parameter \beta, the covariate x_{i}, and the random effects a_{b}. The link function specifies the relationship between the mean of the outcome \mu_{ib} and the linear predictor. In this case no transformation is applied to the linear predictor to match its range with the range of the outcome, as both can take on values within the real line (refer to Section 3.2.1). Lastly, resulting from the use of Bayesian procedures, a fourth component can be added to any GLMM: the prior distributions. The priors describe what is known about the parameters \beta and a_{b} before observing any empirical data.

GLMM components

  1. Likelihood component
  2. Linear predictor
  3. Link function

On the other hand, a Beta-proportion LMM is also a GLMM, and it can be represented probabilistically as follows:

\begin{align*} y_{ib} &\sim \text{BetaProp}( \mu_{ib}, 10 ) \\ \mu_{ib} &= \text{logit}^{-1}( \beta x_{i} + a_{b} ) \\ \beta &\sim \text{Normal}(0, 0.5) \\ a_{b} &\sim \text{Normal}(0, 1) \\ \end{align*}

Notice the representation also highlights the three components of a GLMM; however, their assumptions are now slightly different. The likelihood component assumes a beta-proportion distribution for the outcome with mean \mu_{ib} and sample size equal to 10. The linear predictor is still a linear combination of the parameter \beta, the covariate x_{i}, and the random intercepts a_{b}. However, the link function now assumes the mean of the outcome is nonlinearly related to the linear predictor through an inverse-logit function: \text{logit}^{-1}(x) = \exp(x) / (1+\exp(x)). The inverse-logit function maps the linear predictor, which can take any value on the real line, onto the range of the mean of the beta-proportion distribution, \mu_{ib} \in [0,1] (refer to Section 3.2.2). Lastly, the fourth component resulting from the use of Bayesian procedures, the prior distributions for \beta and a_{b}, is also declared.
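
The role of the inverse-logit function can be illustrated with a short simulation. The sketch below is written in base R and assumes, for illustration only, five blocks of twenty observations and a beta-proportion distribution expressed through the standard beta shape parameters \mu M and (1-\mu) M; the helper inv_logit is a hypothetical name.

```r
set.seed(12345)
inv_logit = function(v){ exp(v) / (1 + exp(v)) }   # equivalent to plogis(v)

B = 5; n_b = 20                       # blocks and block size (assumed)
block = rep( 1:B, each=n_b )
n = length(block)

beta = rnorm( n=1, mean=0, sd=0.5 )   # fixed slope, drawn from its prior
a = rnorm( n=B, mean=0, sd=1 )        # random intercepts, one per block
x = rnorm( n=n, mean=0, sd=1 )        # covariate (assumed standard normal)

eta = beta*x + a[block]               # linear predictor, on the real line
mu = inv_logit( eta )                 # mean of the outcome, inside (0,1)

M = 10                                # 'sample size' of the distribution
y = rbeta( n=n, shape1=mu*M, shape2=(1-mu)*M )

rbind( eta=range(eta), mu=range(mu), y=range(y) )
```

However extreme the linear predictor becomes, the simulated means and outcomes remain inside the unit interval.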

3.3.3 Importance

Understanding LMM is essential due to the ubiquitous assumption of normally distributed outcomes within the speech intelligibility research field (see Boonen et al., 2021; Flipsen, 2006; Lagerberg et al., 2014). Furthermore, their significance also lies in their ability to model clustered outcomes. Clustering occurs when multiple observations arise from the same individual, location, or time (McElreath, 2020). Accounting for data clustering is essential, as disregarding it may result in biased and inefficient parameter estimates. Consequently, such biases and inefficiencies can diminish statistical power or increase the likelihood of committing a type I error. Statistical power defines the model’s ability to reject the null hypothesis when it is false (Everitt & Skrondal, 2010). Type I error occurs when a null hypothesis is erroneously rejected (Everitt & Skrondal, 2010).

Clustering

Occurs when multiple observations arise from the same individual, location, or time (McElreath, 2020).

Statistical power

The model’s ability to reject the null hypothesis when it is false (Everitt & Skrondal, 2010).

Type I error

The error that results when a null hypothesis is erroneously rejected (Everitt & Skrondal, 2010).
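
A rough sense of how much clustering matters can be obtained from the design effect, 1 + (m-1)\rho, a standard survey-sampling quantity in which m is the cluster size and \rho is the intraclass correlation. The sketch below is an illustration under assumed values (20 clusters of 10 observations, unit variances) and is not part of the study's analysis.

```r
set.seed(12345)
B = 20; m = 10                      # clusters and cluster size (assumed)
block = rep( 1:B, each=m )

sd_a = 1; sd_e = 1                  # between- and within-cluster sd (assumed)
a = rnorm( n=B, mean=0, sd=sd_a )
y = a[block] + rnorm( n=B*m, mean=0, sd=sd_e )

icc = sd_a^2 / (sd_a^2 + sd_e^2)    # intraclass correlation
deff = 1 + (m-1)*icc                # design effect

# standard error of the grand mean, ignoring versus respecting the clusters
se_naive = sd(y) / sqrt(B*m)
se_cluster = sd( tapply(y, block, mean) ) / sqrt(B)

round( c(icc=icc, deff=deff, se_naive=se_naive, se_cluster=se_cluster), 3 )
```

Treating the 200 observations as independent understates the uncertainty of the mean, which is one route by which disregarding clustering can inflate the type I error rate.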

Moreover, the significance of GLMM lies in offering the same benefits as the LMMs, in terms of parameter unbiasedness and efficiency. However, the framework also allows for the modeling of (non)linear relationships of (non)normally distributed outcomes. This is particularly important for modeling bounded data, such as the entropy scores utilized in this study. Refer to Section 3.2.3 to understand the importance of considering the bounded nature of the data in the modeling process.

3.4 Measurement error in an outcome

3.4.1 What is the problem?

Measurement error refers to the disparity between the observed values of a variable, recorded under similar conditions, and some fixed true value which is not directly observable (Everitt & Skrondal, 2010). The problem of measurement error in an outcome is easier to understand with a motivating example. Using a similar model as the one depicted in Section 3.1.3, the probabilistic representation of measurement error in the outcome can be depicted as follows:

Measurement error

Refers to the disparity between the observed values of a variable, recorded under similar conditions, and some fixed true value which is not directly observable (Everitt & Skrondal, 2010).

\begin{align*} \tilde{y}_{i} &\sim \text{Normal}( y_{i}, s ) \\ y_{i} &\sim \text{Normal}( \mu_{i}, 1 ) \\ \mu_{i} &= \beta \cdot x_{i} \\ \beta &\sim \text{Uniform}( -20, 20 ) \end{align*}

This representation effectively means that a manifest outcome \tilde{y}_{i} is assumed to be normally distributed with a mean equal to the latent outcome y_{i} and a standard deviation s, which represents the measurement error. The latent outcome y_{i} is also assumed to be normally distributed but with a mean \mu_{i} and a standard deviation of one. The mean of the latent outcome is considered to be explained by a linear combination of the covariate x_{i} and its expected effect \beta. Lastly, prior to observing any data, \beta is assumed to follow a uniform distribution within the range of [-20, +20], representing a non-informative prior.

For illustrative purposes, a simulated outcome with n=100 observations was generated, assuming \beta=0.2, and a measurement error of s=2. Figure 9 shows the scatter plot of the generated data (see code below). The left panel of the figure demonstrates that the manifest outcome has a larger spread than the latent outcome depicted in the right panel. As a result, although \beta is expected to be estimated in an unbiased manner, the statistical hypothesis tests for the parameter will likely be affected due to this larger variability.

The estimation output confirms the previous hypothesis. The posterior distribution of \beta, estimated using the manifest outcome, has a larger standard deviation than the one estimated using the appropriate latent outcome (see Figure 10 and code output below). Furthermore, the code output shows the parameter’s posterior distribution can no longer reject the null hypothesis at confidence levels of 90\% and 95\%, indicating a reduced statistical power.

Code
set.seed(12345)
n = 100

b = 0.2
x = rnorm( n=n, mean=0, sd=1 )

mu_y = b*x
y = rnorm( n=n, mean=mu_y, sd=1 )

s = 2
y_tilde = rnorm( n=n, mean=y, sd=s )
1
replication seed
2
simulation sample size
3
covariate effect
4
covariate simulation
5
linear predictor on outcome mean
6
latent outcome simulation
7
measurement error
8
manifest outcome simulation
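
Because the measurement error is added on top of the latent outcome's own variability, and the two are simulated independently, the manifest outcome is expected to vary around \mu_{i} with a standard deviation of \sqrt{1 + s^2}. The following sketch checks this with a large simulated sample; the sample size of 100,000 is an arbitrary choice made only for the check.

```r
set.seed(12345)
n_chk = 1e5                                  # large sample, only for the check
b = 0.2; s = 2

x = rnorm( n=n_chk, mean=0, sd=1 )
y = rnorm( n=n_chk, mean=b*x, sd=1 )         # latent outcome
y_tilde = rnorm( n=n_chk, mean=y, sd=s )     # manifest outcome

sd_lat = sd( y - b*x )                       # spread of the latent outcome
sd_man = sd( y_tilde - b*x )                 # spread of the manifest outcome

round( c(latent=sd_lat, manifest=sd_man, theory=sqrt(1 + s^2)), 3 )
```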
Code
# grid approximation
Ngp = 1000

b_cand = seq( from=-20, to=20, length.out=Ngp )

udf = function(i){ b_cand[i]*x }
mu_y = sapply( 1:length(b_cand), udf )

udf = function(i){ prod( dnorm( y_tilde, mean=mu_y[,i], sd=s ) ) }
y_lik_man = sapply( 1:length(b_cand), udf )

udf = function(i){ prod( dnorm( y, mean=mu_y[,i], sd=1 ) ) }
y_lik_lat = sapply( 1:length(b_cand), udf )

b_prior = rep( 1/40, length(b_cand) )

b_prop_man = y_lik_man * b_prior
b_post_man = b_prop_man / sum(b_prop_man)

b_prop_lat = y_lik_lat * b_prior
b_post_lat = b_prop_lat / sum(b_prop_lat)
1
number of points in candidate list
2
candidate list for parameter
3
user defined function: linear predictor for each candidate
4
calculation of the linear predictor for each candidate
5
user defined function: product of individual observation likelihoods for manifest outcome
6
manifest outcome data likelihood
7
user defined function: product of individual observation likelihoods for latent outcome
8
latent outcome data likelihood
9
uniform prior distribution for parameter, on manifest and latent outcomes
10
proportional posterior distribution for parameter on manifest outcome
11
posterior distribution for parameter on manifest outcome
12
proportional posterior distribution for parameter on latent outcome
13
posterior distribution for parameter on latent outcome
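
The product of many small density values can underflow to zero in larger samples, so a commonly used alternative is to sum log-densities and exponentiate only after subtracting the maximum. The sketch below repeats the grid approximation for the latent outcome on the log scale. It is a numerically safer variant under the same model assumptions, not the code used in the study, and the finer grid of 4001 points is an arbitrary choice.

```r
set.seed(12345)
n = 100; b = 0.2
x = rnorm( n=n, mean=0, sd=1 )
y = rnorm( n=n, mean=b*x, sd=1 )

b_cand = seq( from=-20, to=20, length.out=4001 )

# log-likelihood of the data for each candidate value
log_lik = sapply( b_cand, function(bc){
  sum( dnorm( y, mean=bc*x, sd=1, log=TRUE ) ) } )

log_prior = rep( log(1/40), length(b_cand) )   # Uniform(-20, 20) prior

log_prop = log_lik + log_prior
b_post = exp( log_prop - max(log_prop) )       # guards against underflow
b_post = b_post / sum(b_post)

b_exp = sum( b_cand * b_post )                 # posterior expectation
b_ols = sum(x*y) / sum(x^2)                    # closed form under a flat prior

c( grid=b_exp, closed_form=b_ols )
```

With a flat prior and a known residual standard deviation, the posterior expectation should agree with the least-squares slope through the origin, which offers a simple check of the grid.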
Code
paste0( 'true beta = ', b )

# manifest outcome
b_exp_man = sum( b_cand * b_post_man )
paste0( 'estimated beta (expectation on manifest) = ', 
        round(b_exp_man, 3) )

b_var_man = sqrt( sum( ( (b_cand-b_exp_man)^2 ) * b_post_man ) )
paste0( 'estimated beta (standard deviation on manifest) = ', 
        round(b_var_man, 3) )


# latent outcome
b_exp_lat = sum( b_cand * b_post_lat )
paste0( 'estimated beta (expectation on latent) = ', 
        round(b_exp_lat, 3) )

b_var_lat = sqrt( sum( ( (b_cand-b_exp_lat)^2 ) * b_post_lat ) )
paste0( 'estimated beta (standard deviation on latent) = ', 
        round(b_var_lat, 3) )

# null hypothesis rejection
b_prob_man = sum( b_post_man[ b_cand > 0 ] )
paste0( 'P(estimated beta on manifest > 0) = ', 
        round(b_prob_man, 3) )

b_prob_lat = sum( b_post_lat[ b_cand > 0 ] )
paste0( 'P(estimated beta on latent > 0) = ', 
        round(b_prob_lat, 3) )
1
true values for the parameter
2
expected value for the parameter on manifest outcome
3
standard deviation for the parameter on manifest outcome
4
expected value for the parameter on latent outcome
5
standard deviation for the parameter on latent outcome
6
probability that the parameter is greater than zero, on manifest outcome
7
probability that the parameter is greater than zero, on latent outcome
[1] "true beta = 0.2"
[1] "estimated beta (expectation on manifest) = 0.212"
[1] "estimated beta (standard deviation on manifest) = 0.176"
[1] "estimated beta (expectation on latent) = 0.299"
[1] "estimated beta (standard deviation on latent) = 0.088"
[1] "P(estimated beta on manifest > 0) = 0.887"
[1] "P(estimated beta on latent > 0) = 1"
Code
par( mfrow=c(1,2) )

plot( x, y_tilde, xlim=c(-3,3), ylim=c(-7,7),
      pch=19, col=rgb(0,0,0,alpha=0.3),
      ylab=expression(tilde(y)),
      main='manifest outcome' )
abline( a=0, b=b, lty=2, col='blue')
abline( a=0, b=b_exp_man, lty=2, col='red' )
legend( 'topleft', legend=c('true', 'expected'),
        bty='n', col=c('blue','red'), lty=rep(2,3) )

plot( x, y, xlim=c(-3,3), ylim=c(-7,7),
      pch=19, col=rgb(0,0,0,alpha=0.3),
      main='latent outcome' )
abline( a=0, b=b, lty=2, col='blue')
abline( a=0, b=b_exp_lat, lty=2, col='red' )
legend( 'topleft', legend=c('true', 'expected'),
        bty='n', col=c('blue','red'), lty=rep(2,3) )

par( mfrow=c(1,1) )
1
simulation plot of manifest outcome
2
simulation plot of latent outcome
Figure 9: Measurement error simulation
Code
par(mfrow=c(1,2))

plot( b_cand, b_post_man, type='l', xlim=c(-0.5,1),
      main='Posterior on manifest outcome',
      xlab=expression(beta), ylab='probability' ) 
abline( v=c(0, b, b_exp_man), lty=2, col=c('gray', 'blue', 'red') )
legend( 'topleft', legend=c('true', 'expected'),
        bty='n', col=c('blue','red'), lty=rep(2,3) )

plot( b_cand, b_post_lat, type='l', xlim=c(-0.5,1),
      main='Posterior on latent outcome',
      xlab=expression(beta), ylab='probability' ) 
abline( v=c(0, b, b_exp_lat), lty=2, col=c('gray', 'blue','red') )
legend( 'topleft', legend=c('true', 'expected'),
        bty='n', col=c('blue','red'), lty=rep(2,3) )

par(mfrow=c(1,1))
1
posterior distribution plot on manifest outcome
2
posterior distribution plot on latent outcome
Figure 10: Bayesian inference: grid approximation on measurement error outcomes

3.4.2 How to solve it?

Latent variables can be used to address the problem arising from the larger observed variability in one or more manifest outcomes. A latent variable is a variable that cannot be directly measured but is assumed to be primarily responsible for the variability in one or more manifest variables (Everitt & Skrondal, 2010). Latent variables can be interpreted as hypothetical constructs, traits, or true variables that account for the variability that induces dependence in one or more manifest variables (Rabe-Hesketh, Skrondal, & Pickles, 2004). This concept is akin to a linear mixed model, where the random effects serve to account for the variability that induces dependence within clustered outcomes (Rabe-Hesketh et al., 2004) (refer to Section 3.3). The most widely known examples of latent variable models include Confirmatory Factor Analysis and Structural Equation Models (CFA and SEM, respectively).

Latent variables

Variables that cannot be measured directly but are assumed to be primarily responsible for the common variability in one or more manifest variables (Everitt & Skrondal, 2010).

Commonly, latent variable models consist of two parts: a measurement part and a structural part. In the measurement part, the principles of the Thurstonian model (Luce, 1959; Thurstone, 1927) are employed to aggregate one or more manifest variables and estimate a latent variable. In the structural part, regression-like relationships among latent and other manifest variables are specified, allowing researchers to test hypotheses about their (causal) relationships (Hoyle, 2014). While the measurement part is sometimes of interest in its own right, the substantive model of interest is often defined by the structural part (Rabe-Hesketh et al., 2004).
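
The split between the two parts can be illustrated with a toy generative sketch. It is not the model used in the study, nor an implementation of the Thurstonian model: the number of units, the number of manifest indicators per unit, and the use of a simple average as a stand-in for the measurement part are all assumptions made for illustration.

```r
set.seed(12345)
n = 2000     # units (assumed)
K = 5        # manifest indicators per unit (assumed)
b = 0.2; s = 2

# structural part: the latent variable is explained by a covariate
x = rnorm( n=n, mean=0, sd=1 )
eta = rnorm( n=n, mean=b*x, sd=1 )

# measurement part: K noisy manifest indicators of the same latent variable
Y = sapply( 1:K, function(k){ rnorm( n=n, mean=eta, sd=s ) } )

# error variance of a single indicator versus the average of the K indicators
err_single = var( Y[,1] - eta )
err_mean = var( rowMeans(Y) - eta )

round( c(single=err_single, average=err_mean, theory_average=s^2/K), 3 )
```

Pooling several manifest indicators of the same latent variable reduces the error variance roughly by a factor of K, which conveys why aggregating manifest variables helps recover the latent one.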

3.4.3 Importance

It becomes evident that when an outcome is measured with error, the estimation procedures based on standard assumptions yield inefficient parameter estimates. This implies that the parameters are not estimated with sufficient precision. Consequently, such inefficiency can reduce statistical power and increase the likelihood of committing a type II error, which occurs when a null hypothesis is erroneously accepted (Everitt & Skrondal, 2010).

Type II error

The error that results when a null hypothesis is erroneously accepted (Everitt & Skrondal, 2010).

Therefore, the issue of measurement error in an outcome is highly relevant to this study. This research assumes that a speaker’s (latent) potential intelligibility contributes, in part, to the observed variability in the speaker’s (manifest) entropy scores. Given the interest in testing hypotheses about the potential intelligibility of speakers, and considering that the entropy scores are subject to measurement error, it becomes necessary to use latent variables to generate precise parameter estimates to test the hypothesis of interest.

3.5 Distributional departures

3.5.1 Heteroscedasticity

In the context of regression analysis, heteroscedasticity occurs when the variance of an outcome depends on the values of another variable (Everitt & Skrondal, 2010). The opposite case is called homoscedasticity. An example of heteroscedasticity can be probabilistically represented as follows:

Heteroscedasticity

Occurs when the variance (standard deviation) of an outcome depends on the values of another variable. The opposite case is called homoscedasticity (Everitt & Skrondal, 2010).

\begin{align*} y_{i} &\sim \text{Normal}( \mu_{i}, \sigma_{i} ) \\ \mu_{i} &= \beta \cdot x_{i} \\ \sigma_{i} &= \exp( \gamma \cdot x_{i} ) \\ \beta &\sim \text{Uniform}( -20, 20 ) \\ \gamma &\sim \text{Uniform}( -20, 20 ) \end{align*}

This representation implies that an outcome y_{i} is assumed normally distributed with mean \mu_{i} and a standard deviation \sigma_{i}. Furthermore, the mean and standard deviation of the outcome are explained by the covariate x_{i}, through the parameters \beta and \gamma. Lastly, prior to observing any data, \beta and \gamma are assumed to be uniformly distributed in the range of [-20,+20].

Figure 11 illustrates the presence of heteroscedasticity using the previous representation, assuming a sample size of n=100, and parameters \beta=0.2 and \gamma=1. Notice the variability of the outcome increases as the covariate increases. Consequently, it is easy to intuit that this difference in the outcome’s variability could have an impact on the statistical hypothesis tests of \beta, and even on the estimate itself. To test the intuition, an incorrect model is used to estimate \beta.

\begin{align*} y_{i} &\sim \text{Normal}( \mu_{i}, 1 ) \\ \mu_{i} &= \beta \cdot x_{i} \\ \beta &\sim \text{Uniform}( -20, 20 ) \end{align*}

The results support the intuition. When an outcome is erroneously assumed homoscedastic, the parameter estimate not only becomes inefficient but is also estimated far from the true value, as seen in the code output below and in Figure 12.

Code
set.seed(12345)
n = 100

b = 0.2
g = 1

x = rnorm( n=n, mean=0, sd=1 )

mu_y = b*x
s_y = exp(g*x) 

y = rnorm( n=n, mean=mu_y, sd=s_y )
1
replication seed
2
simulation sample size
3
beta and gamma effects
4
covariate simulation
5
(non)linear predictor on outcome mean and standard deviation
6
outcome simulation
Code
# grid approximation
Ngp = 1000

b_cand = seq( from=-20, to=20, length.out=Ngp )

udf = function(i){ b_cand[i]*x }
mu_y = sapply( 1:length(b_cand), udf )

udf = function(i){ prod( dnorm( y, mean=mu_y[,i], sd=1 ) ) }
y_lik = sapply( 1:length(b_cand), udf )

b_prior = rep( 1/40, length(b_cand) )

b_prop = y_lik * b_prior
b_post = b_prop / sum(b_prop)
1
number of points in candidate list
2
candidate list for parameter
3
user defined function: linear predictor for each candidate
4
calculation of the linear predictor for each candidate
5
user defined function: product of individual observation likelihoods
6
outcome data likelihood
7
uniform prior distribution for parameter (min=-20, max=20)
8
proportional posterior distribution for parameter
9
posterior distribution for parameter
Code
paste0( 'true beta = ', b )

b_exp = sum( b_cand * b_post )
paste0( 'estimated beta (expectation) = ', round(b_exp, 3) )

b_max = b_cand[ b_post==max(b_post) ]
paste0( 'estimated beta (maximum probability) = ', round(b_max, 3) )

b_var = sqrt( sum( ( (b_cand-b_exp)^2 ) * b_post ) )
paste0( 'estimated beta (standard deviation) = ', round(b_var, 3) )

b_prob = sum( b_post[ b_cand > 0 ] )
paste0( 'P(estimated beta > 0) = ', round(b_prob, 3) )
1
true values for the parameter
2
expected value for the parameter
3
maximum probability value for the parameter
4
standard deviation for the parameter
5
probability that the parameter is greater than zero
[1] "true beta = 0.2"
[1] "estimated beta (expectation) = 0.638"
[1] "estimated beta (maximum probability) = 0.621"
[1] "estimated beta (standard deviation) = 0.088"
[1] "P(estimated beta > 0) = 1"
Code
plot( x, y, xlim=c(-3,3), ylim=c(-6,6),
      pch=19, col=rgb(0,0,0,alpha=0.3) )
abline( a=0, b=b, lty=2, col='blue')
abline( a=0, b=b_exp, lty=2, col='red' )
abline( a=-4, b=-1, lty=2, col=rgb(0,0,0,alpha=0.3))
abline( a=4.4, b=1.5, lty=2, col=rgb(0,0,0,alpha=0.3))
legend( 'topleft', legend=c('true', 'expected'),
        bty='n', col=c('blue','red'), lty=rep(2,3) )
1
scatter plot of a heteroscedastic outcome
Figure 11: Heteroscedasticity simulation
Code
par(mfrow=c(1,2))

plot( b_cand, b_prior, type='l', xlim=c(-1.5,1.5),
      main='Prior distribution',
      xlab=expression(beta), ylab='probability' ) 
abline( v=0, lty=2, col='gray' )

plot( b_cand, b_post, type='l', xlim=c(-1,1),
      main='Posterior distribution',
      xlab=expression(beta), ylab='probability' ) 
abline( v=c(0, b, b_exp), lty=2, 
        col=c('gray','blue', 'red') )
legend( 'topleft', legend=c('true', 'expected'),
        bty='n', col=c('blue','red'), lty=rep(2,2) )

par(mfrow=c(1,1))
1
prior distribution density plot
2
posterior distribution density plot
Figure 12: Bayesian inference: grid approximation
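
For comparison, the heteroscedastic representation itself can be estimated with a two-dimensional grid over \beta and \gamma. The following is a hedged sketch, not the procedure used in the study: the candidate grids are deliberately narrower than the [-20,+20] prior range to keep the computation light, and the log scale is used to avoid numerical underflow.

```r
set.seed(12345)
n = 100; b = 0.2; g = 1
x = rnorm( n=n, mean=0, sd=1 )
y = rnorm( n=n, mean=b*x, sd=exp(g*x) )

# candidate grids (narrower than the prior range, an assumption for speed)
b_cand = seq( from=-1, to=1, length.out=201 )
g_cand = seq( from=0, to=2, length.out=201 )

# log-likelihood for every (beta, gamma) pair
log_lik = outer( b_cand, g_cand, Vectorize( function(bc, gc){
  sum( dnorm( y, mean=bc*x, sd=exp(gc*x), log=TRUE ) ) } ) )

# flat priors only add a constant, which cancels after normalization
post = exp( log_lik - max(log_lik) )
post = post / sum(post)

b_post = rowSums(post)    # marginal posterior of beta
g_post = colSums(post)    # marginal posterior of gamma

b_exp = sum( b_cand * b_post )
g_exp = sum( g_cand * g_post )

round( c(beta=b_exp, gamma=g_exp), 3 )
```

The resulting posterior expectations can be compared with the true values \beta=0.2 and \gamma=1, and with the estimate obtained from the incorrect homoscedastic model.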

3.5.2 Outliers

In regression analysis, outliers are defined as observations that appear to deviate markedly from the other data points of the sample in which they occur (Everitt & Skrondal, 2010). Although outliers have no unique probabilistic representation, a simple example is illustrated in Figure 13. The figure depicts the presence of three influential observations in the outcome (colored blue). It is easy to intuit that the presence of influential observations can affect the parameter estimates and the hypothesis tests resulting from them.

Outlier

Observation that appears to deviate markedly from the other data points of the sample in which it occurs (Everitt & Skrondal, 2010).

The intuition is supported when \beta is estimated using the same incorrect model used in Section 3.5.1. When an outcome is erroneously assumed to be free of outliers, the parameter value is estimated farther from the truth, as observed in the code output below and in Figure 14.

\begin{align*} y_{i} &\sim \text{Normal}( \mu_{i}, 1 ) \\ \mu_{i} &= \beta \cdot x_{i} \\ \beta &\sim \text{Uniform}( -20, 20 ) \\ \end{align*}

Code
set.seed(12345)
n = 100

b = 0.2
x = rnorm( n=n, mean=0, sd=1 )

mu_y = b*x
y = rnorm( n=n, mean=mu_y, sd=1 )

idx = which( x>1 )
sel = 1:3
y[idx[sel]] = 6 
1
replication seed
2
simulation sample size
3
beta effects
4
covariate simulation
5
linear predictor on outcome mean
6
outcome simulation
7
outlier simulation
Code
# grid approximation
Ngp = 1000

b_cand = seq( from=-20, to=20, length.out=Ngp )

udf = function(i){ b_cand[i]*x }
mu_y = sapply( 1:length(b_cand), udf )

udf = function(i){ prod( dnorm( y, mean=mu_y[,i], sd=1 ) ) }
y_lik = sapply( 1:length(b_cand), udf )

b_prior = rep( 1/40, length(b_cand) )

b_prop = y_lik * b_prior
b_post = b_prop / sum(b_prop)
1
number of points in candidate list
2
candidate list for parameter
3
user defined function: linear predictor for each candidate
4
calculation of the linear predictor for each candidate
5
user defined function: product of individual observation likelihoods
6
outcome data likelihood
7
uniform prior distribution for parameter (min=-20, max=20)
8
proportional posterior distribution for parameter
9
posterior distribution for parameter
Code
paste0( 'true beta = ', b )

b_exp = sum( b_cand * b_post )
paste0( 'estimated beta (expectation) = ', round(b_exp, 3) )

b_max = b_cand[ b_post==max(b_post) ]
paste0( 'estimated beta (maximum probability) = ', round(b_max, 3) )

b_var = sqrt( sum( ( (b_cand-b_exp)^2 ) * b_post ) )
paste0( 'estimated beta (standard deviation) = ', round(b_var, 3) )

b_prob = sum( b_post[ b_cand > 0 ] )
paste0( 'P(estimated beta > 0) = ', round(b_prob, 3) )
1
true values for the parameter
2
expected value for the parameter
3
maximum probability value for the parameter
4
standard deviation for the parameter
5
probability that the parameter is greater than zero
[1] "true beta = 0.2"
[1] "estimated beta (expectation) = 0.477"
[1] "estimated beta (maximum probability) = 0.46"
[1] "estimated beta (standard deviation) = 0.088"
[1] "P(estimated beta > 0) = 1"
Code
plot( x, y, xlim=c(-3,3), ylim=c(-6,6),
      pch=19, col=rgb(0,0,0,alpha=0.3) )
points( x[idx[sel]], y[idx[sel]],
        pch=19, col=rgb(0,0,1,alpha=0.3) )
abline( a=0, b=b, lty=2, col='blue')
abline( a=0, b=b_exp, lty=2, col='red' )
legend( 'topleft', legend=c('true', 'expected'),
        bty='n', col=c('blue','red'), lty=rep(2,3) )
1
scatter plot of an outcome with outliers
Figure 13: Outliers simulation
Code
par(mfrow=c(1,2))

plot( b_cand, b_prior, type='l', xlim=c(-1.5,1.5),
      main='Prior distribution',
      xlab=expression(beta), ylab='probability' ) 
abline( v=0, lty=2, col='gray' )

plot( b_cand, b_post, type='l', xlim=c(-1,1),
      main='Posterior distribution',
      xlab=expression(beta), ylab='probability' ) 
abline( v=c(0, b, b_exp), lty=2, 
        col=c('gray','blue', 'red') )
legend( 'topleft', legend=c('true', 'expected'),
        bty='n', col=c('blue','red'), lty=rep(2,2) )

par(mfrow=c(1,1))
1
prior distribution density plot
2
posterior distribution density plot
Figure 14: Bayesian inference: grid approximation

3.5.3 Solution

As recommended by McElreath (2020), robust models can be used to deal with these types of distributional departures. Robust models are a general class of statistical procedures designed to reduce the sensitivity of the parameter estimates to mild or moderate departures of the data from the model’s assumptions (Everitt & Skrondal, 2010). The procedure consists of modifying the statistical models to include traits that effectively make them robust to small departures from the distributional assumptions, such as heteroscedastic errors or the presence of outliers.

Robust models

A general class of statistical procedures designed to reduce the sensitivity of the parameter estimates to mild or moderate departures of the data from the model’s assumptions (Everitt & Skrondal, 2010).
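
One common way to build such robustness is to replace the normal likelihood with a heavier-tailed one. The sketch below illustrates the idea with a Student-t likelihood on the outlier simulation of Section 3.5.2. The choice of two degrees of freedom, the unit scale, and the helper grid_mean are assumptions made for illustration, and this is not necessarily the robust specification adopted in the study.

```r
set.seed(12345)
n = 100; b = 0.2
x = rnorm( n=n, mean=0, sd=1 )
y = rnorm( n=n, mean=b*x, sd=1 )
y[ which(x > 1)[1:3] ] = 6           # same three outliers as in Section 3.5.2

b_cand = seq( from=-20, to=20, length.out=4001 )
nu = 2                               # degrees of freedom (assumed)

# posterior expectation from a vector of log-likelihoods and a flat prior
grid_mean = function(log_lik){
  post = exp( log_lik - max(log_lik) )
  post = post / sum(post)
  sum( b_cand * post )
}

# normal likelihood with standard deviation equal to one
ll_norm = sapply( b_cand, function(bc){
  sum( dnorm( y, mean=bc*x, sd=1, log=TRUE ) ) } )

# Student-t likelihood with location bc*x and scale equal to one
ll_t = sapply( b_cand, function(bc){
  sum( dt( y - bc*x, df=nu, log=TRUE ) ) } )

b_norm = grid_mean( ll_norm )
b_t = grid_mean( ll_t )

round( c(true=b, normal=b_norm, student_t=b_t), 3 )
```

The heavier tails of the Student-t distribution make extreme observations less surprising, so they pull the estimate less strongly than under the normal likelihood.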

3.5.4 Importance

It is known that dealing with heteroscedasticity and identifying outliers through preliminary univariate procedures is prone to the erroneous transformation or exclusion of valuable information. This can ultimately bias the parameter estimates, and even make them inefficient (McElreath, 2020). Bias refers to the extent to which the statistical method used in a study does not estimate the quantity thought to be estimated (Everitt & Skrondal, 2010).

Bias

Refers to the extent to which the statistical method used in a study does not estimate the quantity thought to be estimated (Everitt & Skrondal, 2010).

Dealing with the possibility of heteroscedasticity or outlying observations is relevant to the present study, because there is an interest in testing hypotheses about the potential intelligibility of speakers. Therefore, it is necessary to consider robust regression models to address these distributional departures and generate unbiased parameter estimates.

4 Introduction

Intelligibility is at the core of successful, felicitous communication. Thus, being able to speak intelligibly is a major achievement in language acquisition and development. Furthermore, intelligibility is considered to be the most practical index to assess competence in oral communication (Kent, Miolo, & Bloedel, 1994). Consequently, it serves as a key indicator for evaluating the effectiveness of various interventions like speech therapy or cochlear implantation (Chin, Bergeson, & Phan, 2012). Speech intelligibility refers to the extent to which a listener can accurately recover the elements in an acoustic signal produced by a speaker, such as phonemes or words (Freeman, Pisoni, Kronenberger, & Castellanos, 2017; van Heuven, 2008; Whitehill & Chau, 2004). Studies that investigate intelligibility have utilized entropy scores to examine differences in children’s intelligibility, particularly between those with normal hearing and those with cochlear implants (Boonen et al., 2021).

Speech intelligibility

The extent to which a listener can accurately recover the elements in an acoustic signal produced by a speaker, such as phonemes or words (Freeman et al., 2017; van Heuven, 2008; Whitehill & Chau, 2004).

However, despite the potential of entropy scores as a fine-grained metric of intelligibility, as proposed by Boonen et al. (2021), they exhibit a statistical complexity that cautions researchers against treating them as straightforward indices of intelligibility. This complexity emerges from the processes of data collection and transcription aggregation, endowing the scores with four distinctive features: boundedness, measurement error, clustering, and the possible presence of outliers and heteroscedasticity. Firstly, entropy scores are confined to an interval between zero and one, a phenomenon known as boundedness (refer to Section 3.2). Secondly, entropy scores are a manifestation of a speaker’s intelligibility, with this intelligibility being the primary factor influencing the observed scores. This issue is commonly referred to as measurement error (refer to Section 3.4). Thirdly, due to the repeated assessment of speakers through multiple speech samples, the scores exhibit clustering (refer to Section 3.3). Lastly, driven by the specific set of speakers and speech samples under scrutiny, these scores may contain outliers and exhibit heteroscedasticity (refer to Section 3.5).

Failure to collectively address these data features can result in numerous statistical challenges that might hamper the researcher’s ability to investigate intelligibility. Notably, neglecting boundedness can, at best, lead to underfitting and, at worst, to misspecification. Underfitting can cause the generation of inconsistent predictions, thus hindering the model’s ability to generalize when confronted with new data. Conversely, misspecification can lead to inconsistent and less efficient parameter estimates (refer to Section 3.2.3). Additionally, overlooking issues such as measurement error, clustering, outliers or heteroscedasticity can lead to biased and less precise parameter estimates, ultimately diminishing the statistical power of models and increasing the likelihood of committing type I or type II errors when addressing research inquiries (refer to Section 3.4.3, Section 3.3.3, and Section 3.5.4).

In the realm of computational statistics and data analysis, several models have been developed to address some of these data features individually and, at times, collectively. All of these models have found moderate adoption in various fields, including speech communication, psychology, education, health care, chemistry, and policy analysis. Specifically, in the domain of speech communication, Boonen et al. (2021) addressed data clustering within the context of intelligibility research. Conversely, de Brito Trindade et al. (2021) and Kangmennaang et al. (2023) concentrated on tackling non-normal bounded data with measurement error in covariates, within the context of chemical reactions and health care access, respectively. Remarkably, despite these individual efforts, there is, to the best of the authors’ knowledge, no study comprehensively addressing all of these data features in a principled way while also transparently and systematically documenting the Bayesian estimation of the resulting statistical models.

5 Research questions

Considering the imperative need to comprehensively address all data features when investigating unobservable and complex traits, this investigation aims to demonstrate the efficacy of the Generalized Linear Latent and Mixed Model (GLLAMM) in handling entropy scores features when exploring research theories concerning speech intelligibility. To achieve this objective, the study will reexamine data originating from transcriptions of spontaneous speech samples, initially collected by Boonen et al. (2021). Subsequently, this data will be aggregated into entropy scores and subjected to modeling through the Bayesian Beta-proportion GLLAMM.

To address the primary objective, the study poses three key research questions. First, given the importance of accurate predictions in developing useful practical models and testing research theories (Shmueli & Koppius, 2011), Research Question 1 (RQ1) assesses whether the Beta-proportion GLLAMM yields more accurate predictions than the more prevalent Normal Linear Mixed Model (LMM) (Holmes et al., 2019). Second, acknowledging that intelligibility is an unobservable, intricate concept and a key indicator of oral communication competence (Kent et al., 1994), Research Question 2 (RQ2) investigates how the proposed model can estimate speakers’ latent intelligibility from manifest entropy scores. Third, recognizing that research involves developing and comparing theories, Research Question 3 (RQ3) illustrates how these research theories can be examined within the model’s framework. Specifically, RQ3 assesses the influence of speaker-related factors on the newly estimated latent intelligibility.

The findings of this study will equip researchers investigating speech intelligibility using entropy scores, or those grappling with similar data challenges, with a statistical tool that improves upon existing research models. The tool will provide an assessment of the predictability of empirical phenomena, along with the capability to develop a quantitative measure for the latent variable of interest. The latter, in turn, could facilitate the appropriate comparison of existing theories related to the latent variable, and even the development of new ones.

6 Data

The data comprised the transcriptions of spontaneous speech samples originally collected by Boonen et al. (2021). The data is not publicly available due to privacy restrictions. Nonetheless, the data can be provided by the corresponding author upon reasonable request.

6.1 Speakers

Boonen et al. (2021) selected 32 speakers, comprising 16 normal hearing children (NH) and 16 hearing-impaired children with cochlear implants (HI/CI). At the time of the collection of the speech samples, the NH group were between 68 and 104 months old (M = 86.3, SD = 9.0), while HI/CI group were between 78 and 98 months old (M = 86.3, SD = 6.7).

6.2 Speech samples

Boonen and colleagues selected speech samples from a large corpus of children’s spontaneously spoken speech recordings. These recordings were obtained as the children narrated a story prompted by the picture book “Frog, Where Are You?” (Mayer, 1969) to a caregiver ‘unfamiliar with the story’. Before recording, the children were allowed to skim over the booklet and examine pictures. Prior to the selection process, the recordings were orthographically transcribed using the CHAT format in the CLAN editor (MacWhinney, 2020). These transcriptions were exclusively used in the curation of appropriate speech samples. To ensure the quality of the selection, Boonen and colleagues excluded sentences containing syntactically ill-formed or incomplete statements, background noise, crosstalk, long hesitations, revisions, or non-words. Finally, ten speech samples were randomly chosen for each of the 32 selected speakers. Each of these samples comprised a single sentence with a length of three to eleven words (M = 7.1, SD = 1.1). The process resulted in a total of 320 selected sentences collectively comprising 2,263 words.

Speech samples

Sentences with a length of three to eleven words (M = 7.1, SD = 1.1).

6.3 Listeners

Boonen and colleagues recruited 105 students from the University of Antwerp. All participants were native speakers of Belgian Dutch and reported no history of hearing difficulties or prior exposure to the speech of hearing-impaired speakers.

6.4 Transcription task

The 320 speech samples and 105 listeners were randomly assigned to five blocks, with each block consisting of approximately 21 listeners who transcribed 64 sentences presented in random order. This resulted in a total of 47,514 transcribed words from the original 2,263 words present in the speech samples. These orthographic transcriptions were automatically aligned at the sentence level with a Python script, in a column-like grid structure like the one presented in Table 1. This alignment process was repeated for each sentence within each speaker and block, and the output was manually checked and adjusted (if needed) in order to appropriately align the words. For more details on the random assignment and alignment procedures refer to Boonen et al. (2021).

6.5 Entropy calculation

Next, this study aggregated the aligned transcriptions by listener, yielding 2,263 entropy scores, one score per word. The entropy scores were calculated following Shannon’s (1948) formula:

Entropy formula

\begin{equation} H_{wsib} = \frac{ - \sum_{k=1}^{K} p_{k} \cdot log_{2}(p_{k}) }{ log_{2}(J)} \end{equation} \tag{4}

where H_{wsib} denotes the entropy scores confined to an interval between zero and one, with w defining the word index, s the sentence index, i the speaker index, and b the block index. Moreover, K describes the number of different word types within transcriptions, and J defines the total number of word transcriptions. Notice that by design, the total number of word transcriptions J corresponds with the number of listeners per block, i.e., 21 listeners. Lastly, p_{k} = \sum_{j=1}^{J} 1(T_{jk}) / J denotes the proportion of word types within transcriptions, with 1(T_{jk}) describing an indicator function that takes the value of one when the word type k is present in the transcription j.

These entropy scores served as the outcome variable, capturing agreement or disagreement among listeners’ word transcriptions. Lower scores indicated a higher degree of agreement between transcriptions and therefore higher intelligibility, while higher scores indicated lower intelligibility, due to a lower degree of agreement in the transcriptions (Boonen et al., 2021; Faes, De Maeyer, & Gillis, 2021). Furthermore, no score is excluded from the modeling process using univariate procedures. Rather, the identification of highly influential observations is performed within the context of the proposed models, as recommended by McElreath (2020) (refer to Section 3.5).

Entropy interpretation

Lower scores indicated a higher degree of agreement between transcriptions and therefore higher intelligibility, while higher scores indicated lower intelligibility, due to a lower degree of agreement in the transcriptions (Boonen et al., 2021; Faes et al., 2021).

Table 1: Hypothetical alignment of word transcriptions and entropy scores. Note: Extracted from Boonen et al. (2021), and slightly modified for illustrative purposes. Entropy scores are calculated for the first sentence, produced by the first speaker assigned to the first block, and transcribed by five listeners \left( s=1, i=1, b=1, J=5 \right). Transcriptions are in Dutch with the English translation below. [B] represents a blank space, and [X] an unidentifiable word.
Transcription   Words
Number          1     2        3          4        5
1               de    jongen   ziet       een      kikker
                the   boy      sees       a        frog
2               de    jongen   ziet       de       [X]
                the   boy      sees       the      [X]
3               de    jongen   zag        [B]      kokkin
                the   boy      saw        [B]      cook
4               de    jongen   zag        geen     kikkers
                the   boy      saw        no       frogs
5               de    hond     zoekt      een      [X]
                the   dog      searches   a        [X]
Entropy         0     0.3109   0.6555     0.8277   1

In this context, it is relevant to exemplify the entropy calculation procedure. For that purpose, the words in positions two, four, and five of Table 1 were used. These words were assumed present in the first sentence, produced by the first speaker assigned to the first block, and transcribed by five listeners (w=\{2,4,5\}, s=1, i=1, b=1, J=5). For word 2, the first four listeners identified the word type jongen (T_{j1}), while the last listener identified the word type hond (T_{j2}). Therefore, two word types were identified (K=2), with proportions equal to \{ p_{1}, p_{2} \} = \{ 4/5, 1/5 \} = \{ 0.8, 0.2 \}, and entropy score equal to:

H_{2111} = \frac{ - \left[ 0.8 \cdot log_{2}(0.8) + 0.2 \cdot log_{2}(0.2) \right] }{ log_{2}(5)} \approx 0.3109

For word 4, two listeners identified the word type een (T_{j1}), one listener the word type de (T_{j2}), one listener the word type geen (T_{j3}), and one listener left a blank space [B] (T_{j4}). A blank space [B] is a symbol that marks the absence of a word in a position where, compared with the other transcriptions during the alignment procedure, a word is expected. Notice that for calculation purposes, because the blank space is not expected in such a position, it is considered a different word type. Consequently, four word types were registered (K=4), with proportions equal to \{ p_{1}, p_{2}, p_{3}, p_{4} \} = \{ 2/5, 1/5, 1/5, 1/5 \} = \{ 0.4, 0.2, 0.2, 0.2 \}, and entropy score equal to:

H_{4111} = \frac{ - \left[ 0.4 \cdot log_{2}(0.4) + 3 \cdot 0.2 \cdot log_{2}(0.2) \right] }{ log_{2}(5)} \approx 0.8277

Lastly, for word 5, each transcription counted as a different word type. It is important to highlight that when a listener does not identify a complete word, or part of it, (s)he is instructed to write [X] in that position. However, for the calculation of the entropy score, if more than one listener marks an unidentifiable word with [X], each one of them is considered a different word type. This is done to avoid the artificial reduction of the entropy score, as [X] values already indicate the word’s lack of intelligibility. Consequently, five word types were observed, T_{j1}=kikker, T_{j2}=[X], T_{j3}=kokkin, T_{j4}=kikkers, T_{j5}=[X] (K=5), with proportions equal to \{ p_{1}, p_{2}, p_{3}, p_{4}, p_{5} \} = \{ 1/5, 1/5, 1/5, 1/5, 1/5 \} = \{ 0.2, 0.2, 0.2, 0.2, 0.2 \}, and entropy score equal to:

H_{5111} = \frac{ - 5 \cdot 0.2 \cdot log_{2}(0.2) }{ log_{2}(5)} = 1
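The worked examples can also be reproduced in a few lines of R. The following is a minimal sketch and not the aggregation code used in the study: the helper `entropy_score()` and the object `table1` are hypothetical names introduced here for illustration. The sketch assumes the usual negative sign of Shannon's entropy, so that scores fall between zero and one, and it treats every [X] as its own word type, as described above.

```r
# Hypothetical helper (not part of the study's code): normalized Shannon
# entropy for the J aligned transcriptions of a single word
entropy_score = function( words ){
  J = length(words)
  # every unidentifiable word [X] is counted as a separate word type
  is_x = words == '[X]'
  if( any(is_x) ){
    words[is_x] = paste0( '[X]', seq_len( sum(is_x) ) )
  }
  p = as.numeric( table(words) ) / J   # proportion of each word type
  -sum( p * log2(p) ) / log2(J)        # normalized by the maximum entropy
}

# transcriptions from Table 1, one vector per word position (five listeners)
table1 = list(
  w1 = c('de','de','de','de','de'),
  w2 = c('jongen','jongen','jongen','jongen','hond'),
  w3 = c('ziet','ziet','zag','zag','zoekt'),
  w4 = c('een','de','[B]','geen','een'),
  w5 = c('kikker','[X]','kokkin','kikkers','[X]')
)

H = sapply( table1, entropy_score )
round( H, 4 )
```

Rounded to four decimals, the result matches the Entropy row of Table 1.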

6.6 Exploring the data

As expected, the data exploration reveals from the start two significant features of the entropy scores: clustering and boundedness (refer to Section 3.2.3 and Section 3.3.3). Clustering arises because word-level scores are generated for numerous sentences, originating from different speakers and evaluated in different blocks (see the code output below, depicting the first ten observations of the data). Boundedness arises because the scores can only take on values within the continuous interval between zero and one, i.e., H_{wsib} \in [0,1] (see Figure 15, showing six randomly selected speakers).

Code
# select the variables of interest
var_int = c('bid','cid','uid','wid','HS','A','Am','Hwsib')

# show the first 10 observations of the data
head( data_H[, var_int], 10 )
   bid cid uid wid HS  A Am      Hwsib
1    1   1   1   1  2 85 17 0.05703883
2    1   1   1   2  2 85 17 0.27857304
3    1   1   1   3  2 85 17 0.27857304
4    1   1   1   4  2 85 17 0.46073935
5    1   1   1   5  2 85 17 0.11344714
6    1   1   1   6  2 85 17 0.00010000
7    1   1   1   7  2 85 17 0.00010000
8    1   1   1   8  2 85 17 0.00010000
9    2   1   2   1  2 85 17 0.06626602
10   2   1   2   2  2 85 17 0.06626602
Code
require(rethinking)   # package requirement

speakers = c(20,8,11, 25,30,6)   # selection of speakers

# density plot for all sentences of each selected speaker
par(mfrow=c(2,3))
for( i in speakers ){
  
  speaker = data_H$cid == i
  
  dat = binning( y=data_H$Hwsib[speaker], min_y=0, max_y=1, 
                 n_bins=20, dens=T )
  
  plot(dat, ylab="Frequency-Density", ylim=c(-0.15,max(dat)), xaxt='n',
       xlim=c(-0.05,1.05), xlab='entropy', col=rgb(0,0,0,0.6) )
  abline( h=0, col='gray' )
  abline( v=c(0,1), lty=2, col=rgb(0,0,0,0.3) )
  axis( side=1, at=as.numeric(names(dat)),
        labels=names(dat), las=2, cex.axis=0.8 )
  mtext( text=paste0('Speaker ', i), side=3, adj=0, cex=1.1)
  
}
par(mfrow=c(1,1))
Figure 15: Entropy scores distribution: all sentences of selected speakers

Additionally, the data show that the 320 speech samples consist of sentences with a minimum of 3 and a maximum of 11 words per sentence (M=7.1, SD=1.1), where most of the speech samples have between 5 and 9 words per sentence (see Figure 16).

Code
# report speech samples (number of words) per speaker and sentence
speech_samples = with(data_H, table(cid, uid) )
speech_samples
    uid
cid   1  2  3  4  5  6  7  8  9 10
  1   8  8  8  7  8  7  7  6  6  5
  2   9  7  7  7  6  6  6  6  5  7
  3   8  8  8  8  8  7  8  7  7  6
  4   8  8  8  7  7  7  7  7  6  6
  5   8  9  7  7  7  7  7  7  6  7
  6   8  6  3  6  6  6  6  5  4  4
  7   9  7  7  7  7  8  4  6  6  6
  8   8  8  8  9  7  7  6  6  6  6
  9   9 11  7  6  7  6  6  5  5  6
  10  9  8  8  9  8  8  7  6  6  6
  11 10  9  7  7  7  7  7  7  6  7
  12  9  8  7  8  8  8  6  7  6  6
  13  7  7  7  7  7  7  7  7  6  6
  14  8  8  8  7  8  6  6  6  5  6
  15  8  8  7  7  7  8  7  8  7  6
  16  8  7  7  7  7  6  6  6  6  6
  17  8  7  7  7  7  6  7  7  6  6
  18  8  8  8  7  7  7  6  6  6  6
  19  9  8  7  7  8  8  6  8  6  6
  20 10  8  8  7  7  7  7  6  6  6
  21  9  8  8  8  7  7  7  8  6  6
  22  9  8  8  7  7  7  7  7  8  7
  23  8  8  7  7  7  7  7  7  6  6
  24  7  7  7  7  7  7  7  8  7  7
  25  9  8  8  7  7  8  8  7  6  6
  26  7  7  7  9  7  7  7  7  7  7
  27  9  8  8  7  7  7  7  7  6  6
  28  9  8  8  8  7  7  7  7  6  6
  29  8  9  8  9  7  7  7  6  6  7
  30  9  7  7  7  7  7  6  6  5  5
  31 11  8  8  8  7  7  7  7  7  6
  32  8  8  8 10  7  8  6  6  6  6
Code
# histogram of words per sentence
speech_samples = data.frame( speech_samples )
hist(speech_samples$Freq, breaks=20, xlim=c(2, 12),
     main='', xlab='words per sentence')
Figure 16: Histogram of words per sentences in the speech samples
Code
# statistical descriptors for the speech samples
psych::describe( speech_samples$Freq )
   vars   n mean   sd median trimmed  mad min max range skew kurtosis   se
X1    1 320 7.07 1.06      7    7.04 1.48   3  11     8 0.19     1.45 0.06

Moreover, the data comprised 16 normal hearing children (NH, hearing status category 1) and 16 hearing impaired children, with cochlear implant (HI/CI, hearing status category 2). At the time of the collection of the speech samples, the NH group were between 68 and 104 months old (M=86.3, SD=9.0), while HI/CI group were between 78 and 98 months old (M=86.3, SD=6.7).

Code
# unique hearing status and chronological age per speaker
d_mom = unique( data_H[,c('cid','HS','A')])

# number of speakers per chronological age and hearing status
with( d_mom, table( A, HS ) )
     HS
A     1 2
  68  0 1
  76  0 1
  78  2 0
  80  1 1
  82  5 0
  83  0 1
  84  0 1
  85  0 6
  86  2 1
  88  1 0
  93  2 2
  94  1 0
  97  1 0
  98  1 0
  104 0 2

Lastly, before fitting the models using Bayesian inference, the data was formatted as a list including all necessary information for the fitting process:

Code
dlist = list(
  
  N = nrow(data_H),                 # number of observations
  B = max(data_H$bid),              # maximum number of blocks
  I = max(data_H$cid),              # maximum number of speakers
  U = max(data_H$uid),              # maximum number of sentences
  W = max(data_H$wid),              # maximum number of words
  
  cHS = max(data_H$HS),             # number of categories in hearing status
  
  bid = data_H$bid,                 # block ID
  cid = data_H$cid,                 # speaker ID
  uid = data_H$uid,                 # sentence ID
  wid = data_H$wid,                 # word ID
  HS = data_H$HS,                   # hearing status
  A = data_H$A,                     # chronological age
  Am = with( data_H, A - min(A) ),  # chronological age (centered at the minimum)
  Hwsib = data_H$Hwsib              # entropy score
  
)

str(dlist)
List of 14
 $ N    : int 2263
 $ B    : int 5
 $ I    : int 32
 $ U    : int 10
 $ W    : num 11
 $ cHS  : int 2
 $ bid  : int [1:2263] 1 1 1 1 1 1 1 1 2 2 ...
 $ cid  : int [1:2263] 1 1 1 1 1 1 1 1 1 1 ...
 $ uid  : int [1:2263] 1 1 1 1 1 1 1 1 2 2 ...
 $ wid  : num [1:2263] 1 2 3 4 5 6 7 8 1 2 ...
 $ HS   : int [1:2263] 2 2 2 2 2 2 2 2 2 2 ...
 $ A    : int [1:2263] 85 85 85 85 85 85 85 85 85 85 ...
 $ Am   : int [1:2263] 17 17 17 17 17 17 17 17 17 17 ...
 $ Hwsib: num [1:2263] 0.057 0.279 0.279 0.461 0.113 ...

7 Methods

This section articulates the probabilistic formalism of both the Normal LMM and the proposed Beta-proportion GLLAMM. Subsequently, it details the set of fitted models and the estimation procedure, along with the criteria employed to assess the quality of the Bayesian inference results. Lastly, the section outlines the methodology employed for model comparison.

7.1 Statistical models

7.1.1 Normal LMM

The general mathematical formalism of the Normal LMM posits that the likelihood of the (manifest) entropy scores H_{wsib} follows a normal distribution, i.e.

\begin{align} H_{wsib} & \sim \text{Normal} \left( \mu_{sib}, \sigma_{i} \right) \end{align} \tag{5}

where \mu_{sib} represents the average entropy at the word-level and \sigma_{i} denotes the standard deviation of the average entropy at the word-level, varying for each speaker. Given the clustered nature of the data, \mu_{sib} is defined by the linear combination of individual characteristics and several random effects:

\begin{align} \mu_{sib} &= \alpha + \alpha_{HS[i]} + \beta_{A, HS[i]} (A_{i} - \bar{A}) + u_{si} + e_{i} + a_{b} \end{align} \tag{6}

where HS_{i} and A_{i} denote the hearing status and chronological age of speaker i, respectively. Additionally, \alpha denotes the general intercept, \alpha_{HS[i]} represents the average entropy for each hearing status group, and \beta_{A,HS[i]} denotes the evolution of the average entropy per unit of chronological age A_{i} for each hearing status group. Furthermore, u_{si} denotes the sentence-speaker random effects measuring the unexplained entropy variability within sentences for each speaker, e_{i} denotes the speaker random effects describing the unexplained entropy variability between speakers, and a_{b} denotes the block random effects assessing the unexplained variability between experimental blocks.

Several notable features of the Normal LMM can be discerned from the equations. Firstly, Equation 5 indicates that the variability of the average entropy at the word-level can differ for each speaker, enhancing the model’s robustness to mild or moderate data departures from the normal distribution assumption, such as heteroscedasticity or outliers (refer to Section 3.5). Secondly, Equation 6 reveals that the model assumes no transformation is applied to the relationship between the average entropy and the linear predictor. This is commonly known as an identity (direct) link function. Moreover, Equation 6 indicates that chronological age is centered around the minimum chronological age in the sample \bar{A}. The centering procedure is employed to prevent the interpretation of parameters outside the range of chronological ages available in the data (Everitt & Skrondal, 2010). Lastly, the equation implies that the model considers a separate intercept and a separate slope of age for each hearing status group, i.e., NH and HI/CI speakers.

Centering

Procedure used to facilitate the interpretation of regression parameters (Everitt & Skrondal, 2010).
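To make the centering and the identity link more concrete, the following minimal sketch simulates word-level entropy scores from a simplified version of Equations 5 and 6. All numerical values and object names (`a_HS`, `b_AHS`, `sigma`, and so on) are hypothetical choices made for illustration, not estimates from the study, and the random effects u_{si}, e_{i}, and a_{b} are omitted for brevity.

```r
set.seed(1)

# hypothetical speaker-level values (not estimates from the study)
A     = c(68, 80, 92, 104)    # chronological age in months
HS    = c(1, 2, 1, 2)         # hearing status group index
alpha = 0.1                   # general intercept
a_HS  = c(0.05, 0.15)         # group-specific intercepts
b_AHS = c(-0.002, -0.001)     # group-specific age slopes
sigma = 0.3                   # word-level standard deviation

# age centered at the minimum age, so the intercepts refer to the
# youngest speaker in the sample rather than to a child of age zero
Am = A - min(A)

# identity link: the linear predictor is the average entropy itself
mu = alpha + a_HS[HS] + b_AHS[HS] * Am

# simulate 1000 word-level entropy scores for each speaker
H = rnorm( n=4000, mean=rep(mu, each=1000), sd=sigma )

# share of simulated scores outside the feasible range [0, 1]
mean( H < 0 | H > 1 )
```

Because nothing constrains the normal likelihood to the unit interval, a share of the simulated scores falls below zero or above one under these assumed values.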

7.1.2 Beta-proportion GLLAMM

The general mathematical formalism of the proposed Beta-proportion GLLAMM comprises four components: a response model, with its likelihood, linear predictor, and link function, and a structural equation model. The response model posits that the likelihood of the entropy scores follows a beta-proportion distribution,

GLLAMM components

  1. Response model, likelihood
  2. Response model, linear predictor
  3. Response model, link function
  4. Structural equation model

\begin{align} H_{wsib} & \sim \text{BetaProp} \left( \mu_{ib}, M_{i} \right) \end{align} \tag{7}

where \mu_{ib} denotes the average entropy at the word-level and M_{i} signifies the dispersion of the average entropy at the word-level, varying for each speaker. Additionally, \mu_{ib} is defined as,

\begin{align} \mu_{ib} &= \text{logit}^{-1}[ a_{b} - SI_{i} ] \end{align} \tag{8}

where \text{logit}^{-1}(x) = exp(x) / (1+exp(x)) is the inverse-logit link function, a_{b} denotes the block random effects, and SI_{i} describes the speaker’s latent potential intelligibility. Conversely, the structural equation model relates the speakers’ latent potential intelligibility to the individual characteristics:

\begin{align} SI_{i} = \alpha + \alpha_{HS[i]} + \beta_{A, HS[i]} (A_{i} - \bar{A}) + e_{i} + u_{i} \end{align} \tag{9}

where \alpha defines the general intercept, \alpha_{HS[i]} denotes the potential intelligibility for different hearing status groups, and \beta_{A,HS[i]} indicates the evolution of potential intelligibility per unit of chronological age for each hearing status group. Furthermore, e_{i} represents speaker random effects, describing unexplained potential intelligibility variability between speakers, and u_{i} = \sum_{s=1}^{S} u_{si}/S denotes sentence random effects, assessing the average unexplained potential intelligibility variability among sentences within each speaker, with S denoting the total number of sentences per speaker.

Several features are evident in these probabilistic representations. Firstly, akin to the Normal LMM, Equation 7 reveals that the dispersion of average entropy at the word level can differ for each speaker. This enhances the model’s robustness to mild or moderate data departures from the beta-proportion distribution assumption (refer to Section 3.5). Secondly, in contrast with the Normal LMM, Equation 8 shows that the potential intelligibility of a speaker has a negative non-linear relationship with the entropy scores, explicitly highlighting the inverse relationship between intelligibility and entropy. This feature also maps the unbounded linear predictor to the bounded limits of the entropy scores. Thirdly, in contrast with the Normal LMM, Equation 9 demonstrates that the structural parameters are interpretable in terms of the latent potential intelligibility scores, where the scale of the latent trait is set by the general intercept \alpha, as required in latent variable models (Depaoli, 2021). Furthermore, the equation implies that the model also considers a separate intercept and a separate slope of age for each hearing status group, i.e., NH and HI/CI speakers. Additionally, Equation 9 indicates that chronological age is centered around the minimum chronological age in the sample \bar{A}. Lastly, the same equation assumes the intelligibility scores have two sources of unexplained variability: e_{i} and u_{i}. The former represents inherent differences in potential intelligibility among different speakers, while the latter assumes that different sentences measure potential intelligibility differently due to variations in word difficulties and their interplay within the sentence.
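The mapping in Equations 7 and 8 can be sketched with a short simulation. This is an illustrative sketch under stated assumptions, not the study's estimation code: the values of `SI`, `a_b`, and `M` are hypothetical, and the sketch assumes the usual mean-and-sample-size parameterization of the beta-proportion distribution, in which the standard beta shape parameters are \mu M and (1-\mu) M.

```r
set.seed(1)

# inverse-logit link (Equation 8)
inv_logit = function(x){ 1 / (1 + exp(-x)) }

# hypothetical values (not estimates from the study)
SI  = c(-1, 0, 1, 2)   # latent potential intelligibility, one per speaker
a_b = 0                # block effect
M   = 10               # dispersion ("sample size") parameter

# higher potential intelligibility implies lower average entropy
mu = inv_logit( a_b - SI )

# beta-proportion draws through the standard beta shape parameters:
# shape1 = mu * M and shape2 = (1 - mu) * M
H = sapply( mu, function(m) rbeta( n=1000, shape1=m*M, shape2=(1-m)*M ) )

round( mu, 3 )   # average entropy per speaker
range( H )       # simulated scores stay within the unit interval
```

Under these assumptions the average entropy decreases as SI_{i} increases, and every simulated score remains inside the unit interval, which is the behavior the link function and the likelihood are meant to guarantee.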

7.2 Prior distributions

Bayesian procedures require the incorporation of priors (refer to Section 3.1). This study establishes priors and hyperpriors for the parameters of both the Normal LMM and the Beta-proportion GLLAMM using prior predictive simulations. This procedure entails the semi-independent simulation of parameters, which are subsequently transformed into simulated data values according to the models’ specifications. The procedure aims to establish meaningful priors and comprehend their implications within the context of the model before incorporating any information derived from empirical data (McElreath, 2020).

Prior predictive simulations

Procedure that entails the semi-independent simulation of parameters, which are subsequently transformed into simulated data values according to the models’ specifications. The procedure aims to establish meaningful priors and comprehend their implications within the context of the model before incorporating any information derived from empirical data (McElreath, 2020).

7.2.1 Normal LMM

For the parameters of the Normal LMM, non-informative priors and hyperpriors are established to align with analogous model assumptions in frequentist methods (refer to Section 3.1.4). The specified priors are as follows:

7.2.1.1 Standard deviation \sigma_{i}

As described in Section 7.3, the models initially consider one \sigma prior for all the speakers. This choice implies that the presumed uncertainty for the unexplained variability of the average entropy at the word-level is the same for all speakers, prior to the observation of empirical data.

\begin{align} \sigma_{i} &\sim \text{Exponential}\left( 2 \right) \end{align} \tag{10}

The left panel of Figure 17 shows that the weakly informative prior allows \sigma to take only positive values, as required for variability parameters (Depaoli, 2021). Furthermore, the right panel of Figure 17 shows that, when transformed to the entropy scale, the model expects predictions to fall beyond the feasible range of the outcome.

Code
require(rethinking)   # package requirement

n = 1000   # simulated sample size

param_pscale = rexp(n=n, rate=2 )                        # parameter scale
param_oscale = rnorm( n=n, mean=0.5, sd=param_pscale )   # entropy scale

par(mfrow=c(1,2))

# density plot: unrestricted parameter scale
dens( param_pscale, xlim=c(0,3),
      show.HPDI=0.95,
      main='Parameter scale', 
      xlab=expression(sigma))

# density plot: bounded entropy scale
dens( param_oscale, xlim=c(0,1),
      show.HPDI=0.95,
      main='Entropy scale', 
      xlab=expression(H[wsib]) )
abline( v=c(0, 1), lty=2, col='gray')

par(mfrow=c(1,1))
Figure 17: Normal LMM, word-level entropy unexplained variability prior distribution: parameter and entropy scale
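The statement that the prior places predictions beyond the feasible range can be given a rough number. The sketch below repeats the simulation behind Figure 17 with a larger sample and computes the share of prior predictive draws outside [0, 1]. The object names `sigma`, `H_sim`, and `p_out`, the mean of 0.5, and the sample size are assumptions made here for illustration.

```r
set.seed(1)
n = 10000   # simulated sample size (larger than in the figure)

# prior draws for sigma, as in Equation 10
sigma = rexp( n=n, rate=2 )

# implied entropy scores around a mean of 0.5
H_sim = rnorm( n=n, mean=0.5, sd=sigma )

# proportion of prior predictive draws outside the feasible range
p_out = mean( H_sim < 0 | H_sim > 1 )
p_out
```

A non-negligible value of `p_out` is the numerical counterpart of the density mass lying beyond the dashed lines in the right panel.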

Furthermore, as described in Section 7.1.1 and Section 7.3, the model may instead consider one \sigma_{i} prior for each of the speakers in the data. This choice implies that the presumed uncertainty about the unexplained variability of the average entropy at the word-level is similar for each speaker, prior to observing empirical data. In this case the parameters are defined in terms of hyperpriors (refer to Section 3.1.5).

\begin{align} r_{S} &\sim \text{Exponential}\left( 2 \right) \\ \sigma_{i} &\sim \text{Exponential}\left( r_{S} \right) \end{align} \tag{11}

The left panel of Figure 18 shows that the weakly informative prior allows \sigma_{i} to take only positive values, as required for variability parameters (Depaoli, 2021). The panel also shows that the parameters are most likely to fall in the interval [0, 2.5]. Moreover, the right panel of Figure 18 shows that, when the prior is transformed to the entropy scale, the model expects scores to fall beyond the feasible range of the outcome.

Code
require(rethinking)   # package requirement

n = 1000   # simulated sample size

# hyperpriors scale
r_s = rexp(n=n, rate=2 )
z_s = rexp(n=n, rate=1 )

param_pscale = r_s * z_s                                 # parameter scale
param_oscale = rnorm( n=n, mean=0.5, sd=param_pscale )   # entropy scale

par(mfrow=c(1,2))

# density plot: unrestricted parameter scale
dens( param_pscale, xlim=c(0,3),
      show.HPDI=0.95,
      main='Parameter scale', 
      xlab=expression(sigma[i]))

# density plot: bounded entropy scale
dens( param_oscale, xlim=c(0,1),
      show.HPDI=0.95,
      main='Entropy scale', 
      xlab=expression(H[wsib]) )
abline( v=c(0, 1), lty=2, col='gray')

par(mfrow=c(1,1))
Figure 18: Normal LMM, word-level entropy unexplained variability prior distribution: parameter and entropy scale
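The simulation behind Figure 18 draws \sigma_{i} as the product of a hyperprior draw and a standard exponential draw. A general property of the exponential distribution is that multiplying a unit-rate exponential variable by r yields an exponential variable with mean r, that is, with rate 1/r. The sketch below checks this numerically. The names `sigma_prod` and `sigma_direct` are hypothetical, and reading r_s as the mean (scale) of the exponential is an assumption about the simulation code, not a claim about the parameterization intended in Equation 11.

```r
set.seed(1)
n = 100000

r_s = rexp( n=n, rate=2 )   # hyperprior draws
z_s = rexp( n=n, rate=1 )   # unit-rate exponential draws

# (a) product form, as used in the prior predictive simulation
sigma_prod = r_s * z_s

# (b) direct form: exponential with mean r_s, i.e., rate 1 / r_s
sigma_direct = rexp( n=n, rate=1/r_s )

# both versions should show similar quantiles
rbind( product = quantile( sigma_prod,   c(0.25, 0.5, 0.75) ),
       direct  = quantile( sigma_direct, c(0.25, 0.5, 0.75) ) )
```

The product form is convenient because the hyperprior and the standardized draw can be sampled independently and combined afterwards.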

7.2.1.2 Intercepts \alpha

This parameter is used in preliminary models where no mathematical formulations regarding how speaker-related factors influence intelligibility are investigated. The prior distribution for \alpha under the Normal LMM is described in Equation 12.

\begin{align} \alpha &\sim \text{Normal} \left( 0, 0.05 \right) \end{align} \tag{12}

The left panel of Figure 19 shows that the prior is narrowly concentrated around zero. Moreover, the right panel of Figure 19 demonstrates that, when the parameter is transformed to the entropy scale, the model anticipates entropy scores at low levels of the feasible range of the outcome. This implies that the prior expects a bias in entropy scores towards lower values.

Code
require(rethinking)   # package requirement

n = 1000   # simulated sample size

# hyperpriors scale
r_s = rexp(n=n, rate=2 )
z_s = rexp(n=n, rate=1 )
param_hscale = r_s * z_s

param_pscale = rnorm( n=n, mean=0, sd=0.05 )                     # parameter scale
param_oscale = rnorm(n=n, mean=param_pscale, sd=param_hscale)    # entropy scale

par(mfrow=c(1,2))

# density plot: unrestricted parameter scale
dens( param_pscale, xlim=c(-0.5,0.5),
      show.HPDI=0.95,
      main='Parameter scale', 
      xlab=expression(alpha))

# density plot: bounded entropy scale
dens( param_oscale, xlim=c(0,1),
      show.HPDI=0.95,
      main='Entropy scale', 
      xlab=expression(H[wsib]) )
abline( v=c(0, 1), lty=2, col='gray')

par(mfrow=c(1,1))
Figure 19: Normal LMM, general intercept prior distribution: parameter and entropy scale

7.2.1.3 Hearing status effects \alpha_{HS[i]}

The prior distribution for the Normal LMM is described in Equation 13. Notably, the same prior is applied to both hearing status categories. This choice implies that the parameters for each category are presumed to have similar uncertainties prior to the observation of empirical data.

\begin{align} \alpha_{HS[i]} &\sim \text{Normal} \left( 0, 0.2 \right) \end{align} \tag{13}

The left panel of Figure 20 reveals a weakly informative prior that restricts the range of probability of \alpha_{HS[i]} between [0.3, 0.7]. This implies that no particular bias towards entropy values above or below 0.5 for different hearing status groups is present in the priors. However, the right panel of Figure 20 demonstrates that, when the prior is transformed to the entropy scale, the model anticipates a concentration of data around low levels of entropy, but also beyond the feasible range of the outcome.

Code
require(rethinking)

n = 1000

r_s = rexp(n=n, rate=2 )
z_s = rexp(n=n, rate=1 )
param_hscale = r_s * z_s

param_pscale = rnorm( n=n, mean=0, sd=0.2 )
param_oscale = rnorm( n=n, mean=param_pscale, sd=param_hscale )

par(mfrow=c(1,2))

dens( param_pscale, xlim=c(-1,1),
      show.HPDI=0.95,
      main='Parameter scale',
      xlab=expression(alpha[HS]))

dens( param_oscale, xlim=c(0,1),
      show.HPDI=0.95,
      main='Entropy scale', 
      xlab=expression(H[wsib]) )
abline( v=c(0, 1), lty=2, col='gray')

par(mfrow=c(1,1))
Code annotations: (1) package requirement; (2) simulated sample size; (3) hyperpriors scale; (4) parameter scale; (5) entropy scale; (6) unrestricted parameter scale; (7) bounded entropy scale.
Figure 20: Normal LMM, hearing status effects prior distribution: parameter and entropy scale

7.2.1.4 Chronological age per hearing status \beta_{A,HS[i]}

The prior distribution for the Normal LMM is described in Equation 14. Notably, the same prior is applied to both hearing status categories. This choice implies that the evolution of entropy attributed to chronological age between the categories is presumed to have similar uncertainties prior to the observation of empirical data.

\begin{align} \beta_{A,HS[i]} &\sim \text{Normal} \left( 0, 0.1 \right) \end{align} \tag{14}

The left panel of Figure 21 shows that the prior restricts \beta_{A,HS[i]} to be mostly within the range of [-0.4, 0.4]. This implies that there is no particular bias towards a positive or negative evolution of entropy scores due to chronological age per hearing status group. However, the right panel of Figure 21 shows that, when this prior is transformed to the entropy scale, the model anticipates a concentration of entropy values at lower levels, but it also expects entropy scores well beyond the feasible range of the outcome.

Code
require(rethinking)

n = 1000

r_s = rexp(n=n, rate=2 )
z_s = rexp(n=n, rate=1 )
param_hscale = r_s * z_s

param_pscale = rnorm( n=n, mean=0, sd=0.1 )
usd = function(i){ param_pscale * data_H$Am[i] }
param_mscale = sapply( 1:length(data_H$Am), usd )
param_oscale = rnorm( n=n, mean=param_pscale, sd=param_hscale )

par(mfrow=c(1,2))

dens( param_pscale, xlim=c(-0.5,0.5),
      show.HPDI=0.95,
      main='Parameter scale', 
      xlab=expression(beta[AHS]))

dens( param_oscale, xlim=c(0,1),
      show.HPDI=0.95,
      main='Entropy scale', 
      xlab=expression(H[wsib]) )
abline( v=c(0, 1), lty=2, col='gray')

par(mfrow=c(1,1))
Code annotations: (1) package requirement; (2) simulated sample size; (3) hyperpriors scale; (4) parameter scale; (5) user defined function; (6) parameter scale; (7) entropy scale; (8) unrestricted parameter scale; (9) bounded entropy scale.
Figure 21: Normal LMM, chronological age per hearing status effects prior distribution: parameter and entropy scale

7.2.1.5 Speaker differences e_{i}

The prior distribution of e_{i} for the Normal LMM is described in Equation 15. The same prior is assigned to each speaker in the sample. This choice implies that differences in entropy scores between speakers are presumed to have similar uncertainties prior to the observation of empirical data, and that these are governed by a set of common parameters. In this case the parameters are defined in terms of hyperpriors (refer to Section 3.1.5).

\begin{align} m_{i} &\sim \text{Normal} \left( 0, 0.05 \right) \\ s_{i} &\sim \text{Exponential} \left( 2 \right) \\ e_{i} &\sim \text{Normal} \left( m_{i}, s_{i} \right) \end{align} \tag{15}

The left panel of Figure 22 shows that the prior anticipates differences in entropy scores between speakers as large as 3 units of entropy. However, the right panel of Figure 22 demonstrates that, when transformed to the entropy scale, the model anticipates a concentration of scores around low levels, but it also expects the differences to go well beyond the feasible range of the outcome.

Code
require(rethinking)

n = 1000

r_s = rexp(n=n, rate=2 )
z_s = rexp(n=n, rate=1 )
param_hscale = r_s * z_s

m_i = rnorm(n=n, mean=0, sd=0.05 )
s_i = rexp(n=n, rate=2 )
param_pscale = rnorm( n=n, mean=m_i, sd=s_i )

param_oscale = rnorm( n=n, mean=param_pscale, sd=param_hscale )

par(mfrow=c(1,2))

dens( param_pscale, xlim=c(-1.5,1.5),
      show.HPDI=0.95,
      main='Parameter scale', 
      xlab=expression(e[i]))

dens( param_oscale, xlim=c(0,1),
      show.HPDI=0.95,
      main='Entropy scale', 
      xlab=expression(H[wsib]) )
abline( v=c(0, 1), lty=2, col='gray')

par(mfrow=c(1,1))
Code annotations: (1) package requirement; (2) simulated sample size; (3) hyperpriors scale; (4) parameter scale; (5) entropy scale; (6) unrestricted parameter scale; (7) bounded entropy scale.
Figure 22: Normal LMM, speaker differences prior distribution: parameter and entropy scale

7.2.1.6 Within sentence-speaker differences u_{si}

The prior distribution of u_{si} for the Normal LMM is described in Equation 16. The same prior is assigned to each sentence within each speaker in the sample. This choice implies that the average entropy score differences among sentences within speakers are presumed to have similar uncertainties prior to the observation of empirical data, and that these are governed by a set of common parameters (refer to Section 3.1.5).

\begin{align} m_{u} &\sim \text{Normal} \left( 0, 0.05 \right) \\ s_{u} &\sim \text{Exponential} \left( 2 \right) \\ u_{si} &\sim \text{Normal} \left( m_{u}, s_{u} \right) \\ \end{align} \tag{16}

The left panel of Figure 23 shows that the prior allows the average differences in entropy among sentences within speakers to be as large as 3 units of measurement. Furthermore, the right panel of Figure 23 demonstrates that, when transformed to the entropy scale, the model anticipates a concentration of scores around mid-levels of entropy. More importantly, the model expects the differences to go beyond the feasible range of the outcome.

Code
require(rethinking)

n = 1000

r_s = rexp(n=n, rate=2 )
z_s = rexp(n=n, rate=1 )
param_hscale = r_s * z_s

m_u = rnorm(n=n, mean=0, sd=0.05 )
s_u = rexp(n=n, rate=2 )
param_pscale = rnorm( n=n, mean=m_u, sd=s_u )

param_oscale = rnorm( n=n, mean=param_pscale, sd=param_hscale )

par(mfrow=c(1,2))

dens( param_pscale, xlim=c(-1.5,1.5),
      show.HPDI=0.95,
      main='Parameter scale', 
      xlab=expression(u[si]))

dens( param_oscale, xlim=c(0,1),
      show.HPDI=0.95,
      main='Entropy scale', 
      xlab=expression(H[wsib]) )
abline( v=c(0, 1), lty=2, col='gray')

par(mfrow=c(1,1))
Code annotations: (1) package requirement; (2) simulated sample size; (3) hyperpriors scale; (4) parameter scale; (5) entropy scale; (6) unrestricted parameter scale; (7) bounded entropy scale.
Figure 23: Normal LMM, within sentence-speaker average differences prior distribution: parameter and entropy scale

7.2.1.7 Random block effect a_{b}

The prior distribution for the Normal LMM is described in Equation 17. The same prior is assigned to each block. This choice implies that the average entropy score differences among blocks are presumed to have similar uncertainties prior to the observation of empirical data, and that these are governed by a set of hyperpriors (refer to Section 3.1.5).

\begin{align} m_{b} &\sim \text{Normal} \left( 0, 0.05 \right) \\ s_{b} &\sim \text{Exponential} \left( 2 \right) \\ a_{b} &\sim \text{Normal} \left( m_{b}, s_{b} \right) \end{align} \tag{17}

The left panel of Figure 24 shows a prior with no particular bias towards differences between blocks above or below zero units of entropy. Nevertheless, the right panel of Figure 24 demonstrates that, when the prior is transformed to the entropy scale, the model anticipates a concentration of data around lower levels of entropy, but also contemplates differences beyond the feasible range of the outcome.

Code
require(rethinking)

n = 1000

r_s = rexp(n=n, rate=2 )
z_s = rexp(n=n, rate=1 )
param_hscale = r_s * z_s

m_b = rnorm(n=n, mean=0, sd=0.05 )
s_b = rexp(n=n, rate=2 )
param_pscale = rnorm( n=n, mean=m_b, sd=s_b )

param_oscale = rnorm( n=n, mean=param_pscale, sd=param_hscale )

par(mfrow=c(1,2))

dens( param_pscale, xlim=c(-1.5,1.5),
      show.HPDI=0.95,
      main='Parameter scale', 
      xlab=expression(a[b]))

dens( param_oscale, xlim=c(0,1),
      show.HPDI=0.95,
      main='Entropy scale', 
      xlab=expression(H[wsib]) )
abline( v=c(0, 1), lty=2, col='gray')

par(mfrow=c(1,1))
Code annotations: (1) package requirement; (2) simulated sample size; (3) hyperpriors scale; (4) parameter scale; (5) entropy scale; (6) unrestricted parameter scale; (7) bounded entropy scale.
Figure 24: Normal LMM, block differences prior distribution: parameter and entropy scale

7.2.1.8 Linear predictor g(\cdot)

After the careful assessment of the prior implications for each parameter, the expected prior distribution for the linear predictor can be constructed for the Normal LMM. The prior predictive simulation can be described as in Equation 18:

\begin{align} m &\sim \text{Normal} \left( 0, 0.05 \right) \\ s &\sim \text{Exponential} \left( 2 \right) \\ e_{i} &\sim \text{Normal} \left( m, s \right) \\ u_{si} &\sim \text{Normal} \left( m, s \right) \\ a_{b} &\sim \text{Normal} \left( m, s \right) \\ \alpha_{HS[i]} &\sim \text{Normal} \left( 0, 0.2 \right) \\ \beta_{A,HS[i]} &\sim \text{Normal} \left( 0, 0.1 \right) \\ g(\cdot) &= \alpha_{HS[i]} + \beta_{A, HS[i]} (A_{i} - \bar{A}) + e_{i} + u_{si} + a_{b} \\ \end{align} \tag{18}

The left panel of Figure 25 shows that the prior expects the linear predictor to be more probable between [-2.5, 2.5], implying that a particular bias towards negative entropy scores is present jointly in these priors. Furthermore, the right panel of Figure 25 demonstrates that, when transformed to the entropy scale, the model anticipates predictions of entropy scores within its feasible range, but somewhat more probable in the extremes of entropy.

Code
require(rethinking)

n = 1000

r_s = rexp(n=n, rate=2 )
z_s = rexp(n=n, rate=1 )
param_hscale = r_s * z_s

m = rnorm(n=n, mean=0, sd=0.05 )
s = rexp(n=n, rate=2 )
e_i = rnorm( n=n, mean=m, sd=s )
u_si = rnorm( n=n, mean=m, sd=s )
a_b = rnorm( n=n, mean=m, sd=s )

aHS = rnorm( n=n, mean=0, sd=0.2 )
bAHS = rnorm( n=n, mean=0, sd=0.1 ) 
param_pscale = aHS + bAHS + u_si + e_i + a_b

param_oscale = rnorm( n=n, mean=param_pscale,
                      sd=param_hscale )

par(mfrow=c(1,2))

dens( param_pscale, xlim=c(-3,3),
      show.HPDI=0.95,
      main='Parameter scale', 
      xlab=expression(g))

dens( param_oscale, xlim=c(0,1),
      show.HPDI=0.95,
      main='Entropy scale', 
      xlab=expression(H[wsib]) )
abline( v=c(0, 1), lty=2, col='gray')

par(mfrow=c(1,1))
Code annotations: (1) package requirement; (2) simulated sample size; (3) hyperpriors scale; (4) parameter scale; (5) entropy scale; (6) unrestricted parameter scale; (7) bounded entropy scale.
Figure 25: Normal LMM, linear predictor distribution: parameter and entropy scale

7.2.2 Beta-proportion GLLAMM

For the parameters of the Beta-proportion GLLAMM, weakly informative priors and hyperpriors are established (refer to Section 3.1.4). The specified priors are as follows:

7.2.2.1 Sample size M_{i}

Similar to the Normal LMM, Section 7.3 describes a Beta-proportion GLLAMM that initially considers one M for all speakers in the data. This choice implies that the presumed uncertainty for the unexplained variability of the average entropy at the word-level is the same for all speakers, prior to the observation of empirical data.

\begin{align} M &\sim \text{Exponential}\left( 0.4 \right) \end{align} \tag{19}

The left and right panels of Figure 26 demonstrate that the prior of M expects the parameter to be more probable in a positive range between [0, 7], while predicting scores within the boundaries of the data. This implies that no particular bias is present in the unexplained variability of the word-level entropy, only that it is positive, as expected for measures of variability.

Code
require(rethinking)

n = 1000

param_pscale = rexp( n=n, rate=0.4 )
param_oscale = rbeta2( n=n, prob=0.5, theta=param_pscale )

par(mfrow=c(1,2))

dens( param_pscale, xlim=c(0,10),
      show.HPDI=0.95,
      main='Parameter scale', 
      xlab=expression(M))

dens( param_oscale, xlim=c(0,1),
      show.HPDI=0.95,
      main='Entropy scale', 
      xlab=expression(H[wsib]) )
abline( v=c(0, 1), lty=2, col='gray')

par(mfrow=c(1,1))
Code annotations: (1) package requirement; (2) simulated sample size; (3) parameter scale; (4) entropy scale; (5) density plot: restricted parameter scale; (6) density plot: bounded entropy scale.
Figure 26: Beta-proportion GLLAMM, word-level entropy unexplained variability prior distribution: parameter and entropy scale
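
The simulation above relies on `rbeta2()` from the `rethinking` package, which parameterizes the beta distribution by a mean (`prob`) and a sample size (`theta`). The following is a minimal base R sketch of the same idea, assuming the usual mapping to shape parameters, shape1 = prob × theta and shape2 = (1 − prob) × theta. The function name `rbeta_prop()` and the seed are hypothetical and are used only for illustration.

```r
set.seed(123)  # arbitrary seed, used only to make the sketch reproducible

# hypothetical helper: beta draws from a mean (mu) and a sample size (M)
rbeta_prop = function(n, mu, M){
  rbeta(n=n, shape1=mu*M, shape2=(1-mu)*M)
}

n = 1000
M = rexp(n=n, rate=0.4)             # prior for M, as in Equation 19
H = rbeta_prop(n=n, mu=0.5, M=M)    # implied draws on the entropy scale

range(H)   # the draws remain within the [0, 1] interval
```

Because the beta distribution is only defined on the unit interval, every draw stays within the feasible range of the entropy scores regardless of the value of M, which is the property the right panel of Figure 26 illustrates.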

Furthermore, as described in Section 7.1.2 and Section 7.3, there is a possibility that the model considers one M_{i} prior for each speaker in the data. This choice implies that the presumed uncertainty for the unexplained dispersion of the average entropy at the word-level is similar for each speaker, prior to the observation of empirical data. In this case the parameters are defined in terms of hyperpriors (refer to Section 3.1.5).

\begin{align} r_{M} &\sim \text{Exponential}\left( 0.2 \right) \\ M_{i} &\sim \text{Exponential}\left( r_{M} \right) \end{align} \tag{20}

The left and right panels of Figure 27 demonstrate that the prior of M_{i} expects the parameters to be more probable in a positive range between [0, 20], while at the same time predicting data within the boundaries of the entropy scores. This implies that no particular bias is present in the unexplained variability of the word-level entropy, only that it is positive, as expected for measures of variability.

Code
require(rethinking)

n = 1000

r_M = rexp(n=n, rate=0.2 )
z_M = rexp(n=n, rate=1 )

param_pscale = r_M * z_M
param_oscale = rbeta2( n=n, prob=0.5, theta=param_pscale )

par(mfrow=c(1,2))

dens( param_pscale, xlim=c(0,20),
      show.HPDI=0.95,
      main='Parameter scale', 
      xlab=expression(M[i]))

dens( param_oscale, xlim=c(0,1),
      show.HPDI=0.95,
      main='Entropy scale', 
      xlab=expression(H[wsib]) )
abline( v=c(0, 1), lty=2, col='gray')

par(mfrow=c(1,1))
Code annotations: (1) package requirement; (2) simulated sample size; (3) hyperpriors scale; (4) parameter scale; (5) entropy scale; (6) density plot: unrestricted parameter scale; (7) density plot: bounded entropy scale.
Figure 27: Beta-proportion GLLAMM, word-level entropy unexplained variability prior distribution: parameter and entropy scale
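
The simulation above builds M_{i} as the product of `r_M` and a unit-rate exponential draw `z_M`. The following quick check, a sketch that assumes this product is the intended construction, shows that for a fixed multiplier the product behaves like an exponential variable whose mean equals that multiplier. The fixed value `r = 5` and the seed are illustrative and are not taken from the study.

```r
set.seed(123)  # arbitrary seed, used only to make the sketch reproducible
n = 1e5

r = 5                              # illustrative fixed multiplier
z = rexp(n=n, rate=1)              # unit-rate exponential draws
x_scaled = r * z                   # product construction, as in the simulation
x_direct = rexp(n=n, rate=1/r)     # exponential with mean r, drawn directly

# both sets of draws have approximately the same mean and quantiles
c(scaled=mean(x_scaled), direct=mean(x_direct))
quantile(x_scaled, probs=c(0.5, 0.95))
quantile(x_direct, probs=c(0.5, 0.95))
```

Under this construction the multiplier acts as the mean of the resulting exponential variable, so larger values of `r_M` imply larger expected values of M_{i}.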

7.2.2.2 Intercepts \alpha

Considering that the structural parameters are now interpretable in terms of the (latent) potential intelligibility scores, the general intercept \alpha is used to set the scale of the latent trait, as it is required in latent variable models (Depaoli, 2021) (refer to Section 3.1.4). The prior distribution for \alpha under the Beta-proportion GLLAMM is described in Equation 21.

\begin{align} \alpha &\sim \text{Normal} \left( 0, 0.05 \right) \end{align} \tag{21}

The left panel of Figure 28 shows that the prior is narrowly concentrated around zero. Moreover, the right panel of Figure 28 demonstrates that, when the parameter is transformed to the entropy scale, the model anticipates entropy scores at mid-levels of the feasible range of the outcome. This implies that no particular bias in entropy scores is expected by the prior.

Code
require(rethinking)

n = 1000

r_M = rexp(n=n, rate=0.2 )
z_M = rexp(n=n, rate=1 )
param_hscale = r_M * z_M

param_pscale = rnorm( n=n, mean=0, sd=0.05 )
param_oscale = rbeta2( n=n, prob=inv_logit(-1*param_pscale),
                       theta=param_hscale ) 

par(mfrow=c(1,2))

dens( param_pscale, xlim=c(-0.5,0.5),
      show.HPDI=0.95,
      main='Parameter scale', 
      xlab=expression(alpha))

dens( param_oscale, xlim=c(0,1),
      show.HPDI=0.95,
      main='Entropy scale', 
      xlab=expression(H[wsib]) )
abline( v=c(0, 1), lty=2, col='gray')

par(mfrow=c(1,1))
Code annotations: (1) package requirement; (2) simulated sample size; (3) hyperpriors scale; (4) parameter scale; (5) entropy scale; (6) unrestricted parameter scale; (7) bounded entropy scale.
Figure 28: Beta-proportion GLLAMM, general intercept prior distribution: parameter and entropy scale
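
In the simulations for the Beta-proportion GLLAMM, the draws on the parameter scale enter the mean of the entropy distribution through `inv_logit(-1*param_pscale)`. The following small sketch illustrates what this mapping does, using `plogis()` from base R as the inverse logit. The grid of latent values is illustrative.

```r
# inverse logit via base R: plogis(x) = 1 / (1 + exp(-x))
latent = c(-3, -1, 0, 1, 3)           # illustrative values on the latent scale

# negative sign: higher latent values map to lower mean entropy
mean_entropy = plogis(-1 * latent)

data.frame(latent=latent, mean_entropy=round(mean_entropy, 3))
```

The mapping is strictly decreasing and always returns values in the (0, 1) interval, with a latent value of zero corresponding to a mean entropy of 0.5. This is why priors centered at zero on the parameter scale translate into entropy scores concentrated around mid-levels.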

7.2.2.3 Hearing status effects \alpha_{HS[i]}

The prior distribution for the Beta-proportion GLLAMM is described in Equation 22. Notably, the same prior is applied to both hearing status categories. This choice implies that the parameters for each category are presumed to have similar uncertainties prior to the observation of empirical data.

\begin{align} \alpha_{HS[i]} &\sim \text{Normal} \left( 0, 0.3 \right) \end{align} \tag{22}

The right panel of Figure 29 demonstrates that, when the \alpha_{HS[i]} prior is transformed to the entropy scale, the model anticipates a concentration of data around mid-levels of entropy, and not beyond the feasible range of the outcome. This implies that no particular bias towards specific entropy score values is expected from using the prior.

Code
require(rethinking)

n = 1000

r_M = rexp(n=n, rate=0.2 )
z_M = rexp(n=n, rate=1 )
param_hscale = r_M * z_M

param_pscale = rnorm( n=n, mean=0, sd=0.3 )
param_oscale = rbeta2( n=n, prob=inv_logit(-1*param_pscale),
                       theta=param_hscale ) 

par(mfrow=c(1,2))

dens( param_pscale, xlim=c(-1,1),
      show.HPDI=0.95,
      main='Parameter scale', 
      xlab=expression(alpha[HS]))

dens( param_oscale, xlim=c(0,1),
      show.HPDI=0.95,
      main='Entropy scale', 
      xlab=expression(H[wsib]) )
abline( v=c(0, 1), lty=2, col='gray')

par(mfrow=c(1,1))
Code annotations: (1) package requirement; (2) simulated sample size; (3) hyperpriors scale; (4) parameter scale; (5) entropy scale; (6) unrestricted parameter scale; (7) bounded entropy scale.
Figure 29: Beta-proportion GLLAMM, hearing status effects prior distribution: parameter and entropy scale

7.2.2.4 Chronological age per hearing status \beta_{A,HS[i]}

The prior distribution for the Beta-proportion GLLAMM is described in Equation 23. Notably, the same prior is applied to both hearing status categories. This choice implies that the evolution of potential intelligibility attributed to chronological age between the categories is presumed to have similar uncertainties, prior to the observation of empirical data.

\begin{align} \beta_{A,HS[i]} &\sim \text{Normal} \left( 0, 0.1 \right) \end{align} \tag{23}

The left panel of Figure 30 shows that the weakly informative prior has no particular bias towards a positive or negative evolution of potential intelligibility due to chronological age per hearing status group. Furthermore, the right panel of Figure 30 demonstrates that, when transformed to the entropy scale, the model anticipates a slight concentration of data around mid-levels of entropy, but more importantly, it does not expect data beyond the feasible range of the outcome.

Code
require(rethinking)

n = 1000

r_M = rexp(n=n, rate=0.2 )
z_M = rexp(n=n, rate=1 )
param_hscale = r_M * z_M

param_pscale = rnorm( n=n, mean=0, sd=0.1 )
usd = function(i){ param_pscale * data_H$Am[i] }
param_mscale = sapply( 1:length(data_H$Am), usd )
param_oscale = rbeta2( n=n, prob=inv_logit(-1*param_pscale),
                       theta=param_hscale )

par(mfrow=c(1,2))

dens( param_pscale, xlim=c(-1,1),
      show.HPDI=0.95,
      main='Parameter scale', 
      xlab=expression(beta[AHS]))

dens( param_oscale, xlim=c(0,1),
      show.HPDI=0.95,
      main='Entropy scale', 
      xlab=expression(H[wsib]) )
abline( v=c(0, 1), lty=2, col='gray')

par(mfrow=c(1,1))
Code annotations: (1) package requirement; (2) simulated sample size; (3) hyperpriors scale; (4) parameter scale; (5) user defined function; (6) parameter scale; (7) entropy scale; (8) unrestricted parameter scale; (9) bounded entropy scale.
Figure 30: Beta-proportion GLLAMM, chronological age per hearing status effects prior distribution: parameter and entropy scale

7.2.2.5 Speaker differences e_{i}

The prior distribution for the Beta-proportion GLLAMM is described in Equation 24. The same prior is assigned to each speaker in the sample. This choice implies that differences in potential intelligibility between speakers are presumed to have similar uncertainties prior to the observation of empirical data, and that these are governed by a set of common parameters, called hyperpriors (refer to Section 3.1.5).

\begin{align} m_{i} &\sim \text{Normal} \left( 0, 0.05 \right) \\ s_{i} &\sim \text{Exponential} \left( 2 \right) \\ e_{i} &\sim \text{Normal} \left( m_{i}, s_{i} \right) \end{align} \tag{24}

The left panel of Figure 31 shows that the prior anticipates differences in intelligibility between speakers as large as 3 units of measurement. Furthermore, the right panel of Figure 31 demonstrates that, when transformed to the entropy scale, the model anticipates a high concentration around mid-levels of entropy. However, it does not expect data beyond the feasible range of the outcome. This implies that no particular bias towards positive or negative differences in potential intelligibility between speakers is expected from using this prior.

Code
require(rethinking)

n = 1000

r_M = rexp(n=n, rate=0.2 )
z_M = rexp(n=n, rate=1 )
param_hscale = r_M * z_M

m_i = rnorm( n=n, mean=0, sd=0.05 )
s_i = rexp(n=n, rate=2 )
param_pscale = rnorm( n=n, mean=m_i, sd=s_i )

param_oscale = rbeta2( n=n, prob=inv_logit(-1*param_pscale),
                       theta=param_hscale )

par(mfrow=c(1,2))

dens( param_pscale, xlim=c(-2,2),
      show.HPDI=0.95,
      main='Parameter scale', 
      xlab=expression(e[i]))

dens( param_oscale, xlim=c(0,1),
      show.HPDI=0.95,
      main='Entropy scale', 
      xlab=expression(H[wsib]) )
abline( v=c(0, 1), lty=2, col='gray')

par(mfrow=c(1,1))
Code annotations: (1) package requirement; (2) simulated sample size; (3) hyperpriors scale; (4) parameter scale; (5) entropy scale; (6) unrestricted parameter scale; (7) bounded entropy scale.
Figure 31: Beta-proportion GLLAMM, speakers differences prior distribution: parameter and entropy scale

7.2.2.6 Average within sentence-speaker differences u_{i}

The prior distribution of u_{i} for the Beta-proportion GLLAMM is described in Equation 25. The same prior is assigned to each sentence within each speaker in the sample. This choice implies that the average potential intelligibility differences among sentences within speakers are presumed to have similar uncertainties prior to the observation of empirical data, and that these are governed by a set of hyperpriors (refer to Section 3.1.5). Next, the within sentence-speaker differences are aggregated to the speaker level to form the sentence random effects u_{i}:

\begin{align} m_{u} &\sim \text{Normal} \left( 0, 0.05 \right) \\ s_{u} &\sim \text{Exponential} \left( 2 \right) \\ u_{si} &\sim \text{Normal} \left( m_{u}, s_{u} \right) \\ u_{i} &= \sum_{s=1}^{S} \frac{u_{si}}{S} \end{align} \tag{25}

The left panel of Figure 32 shows that the prior allows the average differences in potential intelligibility among sentences within speakers to be as large as 0.8 units of measurement. Furthermore, the right panel of Figure 32 demonstrates that, when u_{i} is transformed to the entropy scale, the model anticipates a high concentration of scores around mid-levels of entropy. However, it does not expect data beyond the feasible range of the outcome. This implies that no particular bias towards positive or negative differences in potential intelligibility is expected between speakers.

Code
require(rethinking)

n = 1000

r_M = rexp(n=n, rate=0.2 )
z_M = rexp(n=n, rate=1 )
param_hscale = r_M * z_M

m_u = rnorm( n=n, mean=0, sd=0.05 )
s_u = rexp(n=n, rate=2 )
param_pscale = replicate(
  n=n, expr=mean( rnorm( n=10, mean=m_u, sd=s_u ) ) ) 

param_oscale = rbeta2( n=n, prob=inv_logit(-1*param_pscale),
                       theta=param_hscale )

par(mfrow=c(1,2))

dens( param_pscale, xlim=c(-1,1),
      show.HPDI=0.95,
      main='Parameter scale', 
      xlab=expression(u[i]))

dens( param_oscale, xlim=c(0,1),
      show.HPDI=0.95,
      main='Entropy scale', 
      xlab=expression(H[wsib]) )
abline( v=c(0, 1), lty=2, col='gray')

par(mfrow=c(1,1))
Code annotations: (1) package requirement; (2) simulated sample size; (3) hyperpriors scale; (4) parameter scale; (5) entropy scale; (6) unrestricted parameter scale; (7) bounded entropy scale.
Figure 32: Beta-proportion GLLAMM, average within sentence-speaker differences prior distribution: parameter and entropy scale
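
The aggregation step in Equation 25 narrows the prior for u_{i} relative to the prior for a single u_{si}. The following is a minimal sketch of this effect, assuming fixed illustrative values for the hyperparameters (`m_u = 0`, `s_u = 0.5`) and `S = 10` sentences per speaker, which mirrors the value used in the simulation above. None of these values are estimates from the study.

```r
set.seed(123)  # arbitrary seed, used only to make the sketch reproducible
n = 1e4

S = 10       # illustrative number of sentences per speaker
m_u = 0      # illustrative fixed hyperparameter values
s_u = 0.5

# one sentence-level effect per draw
u_si = rnorm(n=n, mean=m_u, sd=s_u)

# speaker-level average of S sentence-level effects, repeated n times
u_i = replicate(n=n, expr=mean(rnorm(n=S, mean=m_u, sd=s_u)))

# the standard deviation of the average shrinks by a factor of sqrt(S)
c(sd_u_si=sd(u_si), sd_u_i=sd(u_i), expected=s_u/sqrt(S))
```

For fixed hyperparameters the average of S independent effects has a standard deviation of s_u/√S, which is consistent with the narrower range reported for u_{i} in Figure 32 compared with the speaker differences in Figure 31.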

7.2.2.7 Random block effect a_{b}

The prior distribution for the Beta-proportion GLLAMM is described in Equation 26. The same prior is assigned to each block. This choice implies that the average entropy score differences among blocks are presumed to have similar uncertainties prior to the observation of empirical data, and that these are governed by a set of hyperpriors (refer to Section 3.1.5).

\begin{align} m_{b} &\sim \text{Normal} \left( 0, 0.05 \right) \\ s_{b} &\sim \text{Exponential} \left( 2 \right) \\ a_{b} &\sim \text{Normal} \left( m_{b}, s_{b} \right) \end{align} \tag{26}

The left panel of Figure 33 shows a prior with no particular bias towards positive or negative differences between blocks. Furthermore, the right panel of Figure 33 demonstrates that, when transformed to the entropy scale, the model anticipates a high concentration of data around mid-levels of entropy, but not beyond the feasible range of the outcome.

Code
require(rethinking)

n = 1000

r_M = rexp(n=n, rate=0.2 )
z_M = rexp(n=n, rate=1 )
param_hscale = r_M * z_M

m_b = rnorm( n=n, mean=0, sd=0.05 )
s_b = rexp(n=n, rate=2 )
param_pscale = rnorm( n=n, mean=m_b, sd=s_b )

param_oscale = rbeta2( n=n, prob=inv_logit(param_pscale),
                       theta=param_hscale )

par(mfrow=c(1,2))

dens( param_pscale, xlim=c(-2,2),
      show.HPDI=0.95,
      main='Parameter scale', 
      xlab=expression(a[b]))

dens( param_oscale, xlim=c(0,1),
      show.HPDI=0.95,
      main='Entropy scale', 
      xlab=expression(H[wsib]) )
abline( v=c(0, 1), lty=2, col='gray')

par(mfrow=c(1,1))
Code annotations: (1) package requirement; (2) simulated sample size; (3) hyperpriors scale; (4) parameter scale; (5) entropy scale; (6) unrestricted parameter scale; (7) bounded entropy scale.
Figure 33: Beta-proportion GLLAMM, block differences prior distribution: parameter and entropy scale

7.2.2.8 Speech intelligibility SI_{i}

After the careful assessment of the prior implications for each parameter, the expected prior distribution for the potential intelligibility can be constructed for the Beta-proportion GLLAMM. The prior predictive simulation can be described as in Equation 27:

\begin{align} \alpha &\sim \text{Normal} \left( 0, 0.05 \right) \\ \alpha_{HS[i]} &\sim \text{Normal} \left( 0, 0.3 \right) \\ \beta_{A,HS[i]} &\sim \text{Normal} \left( 0, 0.1 \right) \\ m &\sim \text{Normal} \left( 0, 0.05 \right) \\ s &\sim \text{Exponential} \left( 2 \right) \\ e_{i} &\sim \text{Normal} \left( m, s \right) \\ u_{si} &\sim \text{Normal} \left( m, s \right) \\ u_{i} &= \sum_{s=1}^{S} \frac{u_{si}}{S} \\ a_{b} &\sim \text{Normal} \left( m, s \right) \\ SI_{i} &= \alpha + \alpha_{HS[i]} + \beta_{A, HS[i]} (A_{i} - \bar{A}) + e_{i} + u_{i} \end{align} \tag{27}

The left panel of Figure 34 shows that the prior expects speakers’ potential intelligibility scores to be more probable between [-3, 3], implying that no particular bias towards positive or negative potential intelligibility is present jointly in these priors. Furthermore, the right panel of Figure 34 demonstrates that, when transformed to the entropy scale, the model anticipates predictions of entropy scores within its feasible range, but somewhat more probable in the extremes of entropy.

Code
require(rethinking)

n = 1000

r_M = rexp(n=n, rate=0.2 )
z_M = rexp(n=n, rate=1 )
param_hscale = r_M * z_M

m = rnorm( n=n, mean=0, sd=0.05 )
s = rexp(n=n, rate=2 )
e_i = rnorm( n=n, mean=m, sd=s )
u_i = replicate( n=n, 
                 expr=mean( rnorm( n=10, mean=m, sd=s ) ) )
a_b = rnorm( n=n, mean=m, sd=s )

a = rnorm( n=n, mean=0, sd=0.05 )
aHS = rnorm( n=n, mean=0, sd=0.3 ) 
bAHS = rnorm( n=n, mean=0, sd=0.1 ) 
param_pscale = a + aHS + bAHS + e_i + u_i 

param_oscale = rbeta2( n=n, prob=inv_logit(a_b-param_pscale),
                       theta=param_hscale )

par(mfrow=c(1,2))

dens( param_pscale, xlim=c(-3,3),
      show.HPDI=0.95,
      main='Parameter scale', 
      xlab=expression(SI[i]))

dens( param_oscale, xlim=c(0,1),
      show.HPDI=0.95,
      main='Entropy scale', 
      xlab=expression(H[wsib]) )
abline( v=c(0, 1), lty=2, col='gray')

par(mfrow=c(1,1))
Code annotations: (1) package requirement; (2) simulated sample size; (3) hyperpriors scale; (4) parameter scale; (5) entropy scale; (6) unrestricted parameter scale; (7) bounded entropy scale.
Figure 34: Beta-proportion GLLAMM, potential intelligibility distribution: parameter and entropy scale

7.3 Fitted models

This study evaluates the comparative predictive capabilities of both the Normal LMM and the Beta-proportion GLLAMM (RQ1) while simultaneously examining various formulations regarding how speaker-related factors influence intelligibility (RQ3). In this context, the predictive capabilities of the models are intricately connected to these formulations. As a result, the study requires fitting 12 different models, each representing a specific manner to investigate one or both research questions. The models comprised six versions of both the Normal LMM and the Beta-proportion GLLAMM. The differences among the models hinged on (1) whether they addressed data clustering in conjunction with measurement error, denoted as the model type, (2) the assumed distribution for the entropy scores, which aimed to handle boundedness, (3) whether the model incorporates a robust feature to address mild or moderate departures of the data from distributional assumptions, and (4) the inclusion or exclusion of speaker-related factors in the models. A detailed overview of the fitted models is available in Table 2.

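Table 2: Fitted models.

| Model | Model type | Entropy distribution | Robust feature | Fixed effects: \beta_{HS[i]} | Fixed effects: \beta_{A} | Fixed effects: \beta_{A,HS[i]} |
|---|---|---|---|---|---|---|
| 1 | LMM | Normal | No | No | No | No |
| 2 | LMM | Normal | No | Yes | Yes | No |
| 3 | LMM | Normal | No | Yes | No | Yes |
| 4 | LMM | Normal | Yes | No | No | No |
| 5 | LMM | Normal | Yes | Yes | Yes | No |
| 6 | LMM | Normal | Yes | Yes | No | Yes |
| 7 | GLLAMM | Beta-prop. | No | No | No | No |
| 8 | GLLAMM | Beta-prop. | No | Yes | Yes | No |
| 9 | GLLAMM | Beta-prop. | No | Yes | No | Yes |
| 10 | GLLAMM | Beta-prop. | Yes | No | No | No |
| 11 | GLLAMM | Beta-prop. | Yes | Yes | Yes | No |
| 12 | GLLAMM | Beta-prop. | Yes | Yes | No | Yes |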
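
The structure of Table 2 follows a crossed design: two model types, two settings of the robust feature, and three fixed-effects configurations. The following sketch reproduces this layout in base R. The object and column names (`fixed`, `design`, `models`, `b_HS`, `b_A`, `b_AHS`) are hypothetical and are not part of the original code.

```r
# three fixed-effects configurations, in the order used in Table 2
fixed = data.frame(
  b_HS  = c('No', 'Yes', 'Yes'),
  b_A   = c('No', 'Yes', 'No'),
  b_AHS = c('No', 'No', 'Yes'))

# crossed design: the first factor varies fastest in expand.grid()
design = expand.grid(
  fixed_id = 1:3,
  robust = c('No', 'Yes'),
  type = c('LMM', 'GLLAMM'),
  stringsAsFactors = FALSE)

# assemble the overview of the 12 models
models = data.frame(
  model = 1:nrow(design),
  type = design$type,
  distribution = ifelse(design$type == 'LMM', 'Normal', 'Beta-prop.'),
  robust = design$robust,
  fixed[design$fixed_id, ],
  row.names = NULL)

models
```

A table of this kind could also be used to loop over model files when fitting, although the file-naming scheme for such a loop would be an additional assumption.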

The following tabset panel provides the commented Stan code for all fitted models. Furthermore, the models are implemented using non-centered priors (refer to Section 3.1.5).
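
As a brief illustration of the non-centered construction, the sketch below compares draws built as `m + s*z`, with `z` standard normal, against draws taken directly from a normal distribution with mean `m` and standard deviation `s`. The fixed hyperparameter values and the seed are illustrative assumptions and are not estimates from the study.

```r
set.seed(123)  # arbitrary seed, used only to make the sketch reproducible
n = 1e5

m = 0.02     # illustrative fixed hyperparameter values
s = 0.4

z = rnorm(n=n, mean=0, sd=1)        # standardized effects
e_nc = m + s * z                    # non-centered construction
e_c  = rnorm(n=n, mean=m, sd=s)     # centered construction

# both constructions imply the same distribution for the effects
rbind(non_centered=c(mean=mean(e_nc), sd=sd(e_nc)),
      centered=c(mean=mean(e_c), sd=sd(e_c)))
```

Both constructions describe the same prior. The non-centered form only changes which quantities the sampler explores: the standardized effects `z` rather than the effects themselves.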

```{r}

mcmc_code = "
data{

    // dimensions
    int N;                // number of experimental runs
    int B;                // max. number of blocks
    int I;                // max. number of experimental units (speakers)
    int U;                // max. number of sentences
    int W;                // max. number of words

    // category numbers
    int cHS;              // max. number of categories in HS
    
    // data
    array[N] int<lower=1, upper=B> bid;   // block id
    array[N] int<lower=1, upper=I> cid;   // speaker's id
    array[N] int<lower=1, upper=U> uid;   // sentence's id
    
    array[N] int<lower=1, upper=cHS> HS;  // hearing status
    array[N] real Am;     // chron. age - min( chron. age )
    array[N] real Hwsib;  // replicated entropies
    
}
parameters{
    
    // fixed effects parameters
    real a;               // intercept
    // vector[cHS] aHS;      // HS effects
    // real bAm;             // Am effects
    // vector[cHS] bAmHS;    // Am effects (per HS)

    // random effects parameters
    real m_b;             // block RE mean
    real<lower=0> s_b;    // block RE sd
    vector[B] z_b;        // non-centered block RE
    real m_i;             // speaker RE mean
    real<lower=0> s_i;    // speaker RE sd
    vector[I] z_i;        // non-centered speaker RE
    real m_u;             // speaker, utterance RE mean
    real<lower=0> s_u;    // speaker, utterance RE sd
    matrix[I,U] z_u;      // non-centered speaker, utterance RE
    
    // variability parameters
    real<lower=0> s_w;      // speaker, utterance, word sd (one overall)
    //real<lower=0> r_s;      // global rate for SD
    //vector<lower=0>[I] z_s; // non-centered speaker, utterance, word sd (one per speaker)
    
}
transformed parameters{
    
    // to track
    vector[B] b_i;        // block random effects
    vector[I] e_i;        // speaker random effects
    matrix[I,U] u_si;     // sentence random effects
    //vector[I] s_w;        // speaker, utterance, word sd (one per speaker)
    vector[N] mu;         // NO TRACK
    
    // random effects
    b_i = m_b + s_b*z_b;  // non-centered block RE
    e_i = m_i + s_i*z_i;  // non-centered speaker RE
    u_si = m_u + s_u*z_u; // non-centered utterance RE
    //s_w = z_s * r_s;      // non-centered speaker, utterance, word sd
    
    // average entropy
    for(n in 1:N){
      mu[n] = a + 
        // aHS[ HS[n] ] + 
        // bAm*Am[n] + 
        // bAmHS[ HS[n] ]*Am[n] + 
        b_i[ bid[n] ] +
        e_i[ cid[n] ] +
        u_si[ cid[n], uid[n] ];
    }

}
model{

    // fixed effects priors
    a ~ normal( 0 , 0.05 );
    // aHS ~ normal( 0, 0.2 );
    // bAm ~ normal( 0 , 0.1 );
    // bAmHS ~ normal( 0 , 0.1 );
    
    // random effects priors
    m_b ~ normal( 0 , 0.05 );
    s_b ~ exponential( 2 );
    z_b ~ std_normal();
    m_i ~ normal( 0 , 0.05 );
    s_i ~ exponential( 2 );
    z_i ~ std_normal();
    m_u ~ normal( 0 , 0.05 );
    s_u ~ exponential( 2 );
    to_vector( z_u ) ~ std_normal();
    
    // variability priors
    s_w ~ exponential( 2 );
    //r_s ~ exponential( 2 );
    //z_s ~ exponential( 1 );

    // likelihood
    for(n in 1:N){
      Hwsib[n] ~ normal( mu[n] , s_w );
      // Hwsib[n] ~ normal( mu[n] , s_w[ cid[n] ] );
    }
    
}
generated quantities{

    // track
    vector[N] log_lik;
    
    // log-likelihood
    for(n in 1:N){
      log_lik[n] = normal_lpdf( Hwsib[n] | mu[n] , s_w );
      // log_lik[n] = normal_lpdf( Hwsib[n] | mu[n] , s_w[ cid[n] ] );
    }
    
}
"

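# saving (the output directory is created first if it does not exist)
model_nam = "model01.stan"
dir.create( file.path(getwd(), 'real_models'), showWarnings=FALSE )
writeLines(mcmc_code, con=file.path(getwd(), 'real_models', model_nam) )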

```
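
The `data` block above fixes the inputs that every model expects. As a rough sketch, a list with matching names and types could be assembled in R as follows. All sizes and values are simulated placeholders (assumptions for illustration only), and `dlist`, `HS_speaker`, and `A_speaker` are hypothetical names.

```r
# Sketch: simulated data list matching the Stan data block.
# All sizes and values below are placeholders, not the study data.
set.seed(123)
B = 5L; I = 4L; U = 3L; W = 10L; cHS = 2L   # hypothetical dimensions

# one row per block x speaker x sentence combination
grid = expand.grid(bid=1:B, cid=1:I, uid=1:U)
N = nrow(grid)

HS_speaker = c(1L, 1L, 2L, 2L)         # hearing status, one per speaker
A_speaker  = c(68, 72, 80, 75)         # hypothetical chronological ages

dlist = list(
  N=N, B=B, I=I, U=U, W=W, cHS=cHS,
  bid=grid$bid, cid=grid$cid, uid=grid$uid,
  HS=HS_speaker[ grid$cid ],
  Am=(A_speaker - min(A_speaker))[ grid$cid ],  # age minus minimum age
  Hwsib=runif(N, 0.01, 0.99)           # entropies kept inside (0,1)
)
str(dlist)
```

A named list of this shape is what Stan interfaces such as `rstan` typically take as their data argument.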
```{r}

mcmc_code = "
data{

    // dimensions
    int N;                // number of experimental runs
    int B;                // max. number of blocks
    int I;                // max. number of experimental units (speakers)
    int U;                // max. number of sentences
    int W;                // max. number of words

    // category numbers
    int cHS;              // max. number of categories in HS
    
    // data
    array[N] int<lower=1, upper=B> bid;   // block id
    array[N] int<lower=1, upper=I> cid;   // speaker's id
    array[N] int<lower=1, upper=U> uid;   // sentence's id
    
    array[N] int<lower=1, upper=cHS> HS;  // hearing status
    array[N] real Am;     // chron. age - min( chron. age )
    array[N] real Hwsib;  // replicated entropies
    
}
parameters{
    
    // fixed effects parameters
    //real a;               // intercept
    vector[cHS] aHS;      // HS effects
    real bAm;             // Am effects
    // vector[cHS] bAmHS;    // Am effects (per HS)

    // random effects parameters
    real m_b;             // block RE mean
    real<lower=0> s_b;    // block RE sd
    vector[B] z_b;        // non-centered block RE
    real m_i;             // speaker RE mean
    real<lower=0> s_i;    // speaker RE sd
    vector[I] z_i;        // non-centered speaker RE
    real m_u;             // speaker, utterance RE mean
    real<lower=0> s_u;    // speaker, utterance RE sd
    matrix[I,U] z_u;      // non-centered speaker, utterance RE
    
    // variability parameters
    real<lower=0> s_w;      // speaker, utterance, word sd (one overall)
    //real<lower=0> r_s;      // global rate for SD
    //vector<lower=0>[I] z_s; // non-centered speaker, utterance, word sd (one per speaker)
    
}
transformed parameters{
    
    // to track
    vector[B] b_i;        // block random effects
    vector[I] e_i;        // speaker random effects
    matrix[I,U] u_si;     // sentence random effects
    //vector[I] s_w;        // speaker, utterance, word sd (one per speaker)
    vector[N] mu;         // NO TRACK
    
    // random effects
    b_i = m_b + s_b*z_b;  // non-centered block RE
    e_i = m_i + s_i*z_i;  // non-centered speaker RE
    u_si = m_u + s_u*z_u; // non-centered utterance RE
    //s_w = z_s * r_s;      // non-centered speaker, utterance, word sd
    
    // average entropy
    for(n in 1:N){
      mu[n] = //a + 
        aHS[ HS[n] ] + 
        bAm*Am[n] + 
        // bAmHS[ HS[n] ]*Am[n] + 
        b_i[ bid[n] ] +
        e_i[ cid[n] ] +
        u_si[ cid[n], uid[n] ];
    }

}
model{

    // fixed effects priors
    //a ~ normal( 0 , 0.05 );
    aHS ~ normal( 0, 0.2 );
    bAm ~ normal( 0 , 0.1 );
    // bAmHS ~ normal( 0 , 0.1 );
    
    // random effects priors
    m_b ~ normal( 0 , 0.05 );
    s_b ~ exponential( 2 );
    z_b ~ std_normal();
    m_i ~ normal( 0 , 0.05 );
    s_i ~ exponential( 2 );
    z_i ~ std_normal();
    m_u ~ normal( 0 , 0.05 );
    s_u ~ exponential( 2 );
    to_vector( z_u ) ~ std_normal();
    
    // variability priors
    s_w ~ exponential( 2 );
    //r_s ~ exponential( 2 );
    //z_s ~ exponential( 1 );

    // likelihood
    for(n in 1:N){
      Hwsib[n] ~ normal( mu[n] , s_w );
      // Hwsib[n] ~ normal( mu[n] , s_w[ cid[n] ] );
    }
    
}
generated quantities{

    // track
    vector[N] log_lik;
    
    // log-likelihood
    for(n in 1:N){
      log_lik[n] = normal_lpdf( Hwsib[n] | mu[n] , s_w );
      // log_lik[n] = normal_lpdf( Hwsib[n] | mu[n] , s_w[ cid[n] ] );
    }
    
}
"

# saving
model_nam = "model02.stan"
writeLines(mcmc_code, con=file.path(getwd(), 'real_models', model_nam) )

```
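
Each model stores the pointwise log-likelihood in `log_lik`, which is the ingredient that information criteria such as WAIC require. The following base-R sketch computes WAIC from a draws-by-observations matrix. Here `ll` is simulated; in practice it would hold the posterior draws of `log_lik`. The helper `waic_from_loglik` is a hypothetical name.

```r
# Sketch: WAIC from a (draws x observations) log-likelihood matrix.
# ll is simulated here; it stands in for posterior draws of log_lik.
set.seed(123)
S = 1000; N = 50
y = rnorm(N, 0, 0.2)                      # fake observations
mu_draws = rnorm(S, 0, 0.02)              # fake posterior draws of a mean
ll = sapply(y, function(yn) dnorm(yn, mu_draws, 0.2, log=TRUE))

waic_from_loglik = function(ll){
  # log pointwise predictive density (log-sum-exp for numerical stability)
  lppd = apply(ll, 2, function(l) max(l) + log(mean(exp(l - max(l)))))
  # effective number of parameters: variance of the log-lik per observation
  p_waic = apply(ll, 2, var)
  c( lppd=sum(lppd), p_waic=sum(p_waic), waic=-2*( sum(lppd) - sum(p_waic) ) )
}
res = waic_from_loglik(ll)
res
```

Packages such as `rethinking` and `loo` offer ready-made versions of this computation. The sketch only makes explicit what `log_lik` is used for.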
```{r}

mcmc_code = "
data{

    // dimensions
    int N;                // number of experimental runs
    int B;                // max. number of blocks
    int I;                // max. number of experimental units (speakers)
    int U;                // max. number of sentences
    int W;                // max. number of words

    // category numbers
    int cHS;              // max. number of categories in HS
    
    // data
    array[N] int<lower=1, upper=B> bid;   // block id
    array[N] int<lower=1, upper=I> cid;   // speaker's id
    array[N] int<lower=1, upper=U> uid;   // sentence's id
    
    array[N] int<lower=1, upper=cHS> HS;  // hearing status
    array[N] real Am;     // chron. age - min( chron. age )
    array[N] real Hwsib;  // replicated entropies
    
}
parameters{
    
    // fixed effects parameters
    //real a;               // intercept
    vector[cHS] aHS;      // HS effects
    // real bAm;             // Am effects
    vector[cHS] bAmHS;    // Am effects (per HS)

    // random effects parameters
    real m_b;             // block RE mean
    real<lower=0> s_b;    // block RE sd
    vector[B] z_b;        // non-centered block RE
    real m_i;             // speaker RE mean
    real<lower=0> s_i;    // speaker RE sd
    vector[I] z_i;        // non-centered speaker RE
    real m_u;             // speaker, utterance RE mean
    real<lower=0> s_u;    // speaker, utterance RE sd
    matrix[I,U] z_u;      // non-centered speaker, utterance RE
    
    // variability parameters
    real<lower=0> s_w;      // speaker, utterance, word sd (one overall)
    //real<lower=0> r_s;      // global rate for SD
    //vector<lower=0>[I] z_s; // non-centered speaker, utterance, word sd (one per speaker)
    
}
transformed parameters{
    
    // to track
    vector[B] b_i;        // block random effects
    vector[I] e_i;        // speaker random effects
    matrix[I,U] u_si;     // sentence random effects
    //vector[I] s_w;        // speaker, utterance, word sd (one per speaker)
    vector[N] mu;         // NO TRACK
    
    // random effects
    b_i = m_b + s_b*z_b;  // non-centered block RE
    e_i = m_i + s_i*z_i;  // non-centered speaker RE
    u_si = m_u + s_u*z_u; // non-centered utterance RE
    //s_w = z_s * r_s;      // non-centered speaker, utterance, word sd
    
    // average entropy
    for(n in 1:N){
      mu[n] = //a + 
        aHS[ HS[n] ] + 
        // bAm*Am[n] + 
        bAmHS[ HS[n] ]*Am[n] + 
        b_i[ bid[n] ] +
        e_i[ cid[n] ] +
        u_si[ cid[n], uid[n] ];
    }

}
model{
    
    // fixed effects priors
    //a ~ normal( 0 , 0.05 );
    aHS ~ normal( 0, 0.2 );
    // bAm ~ normal( 0 , 0.1 );
    bAmHS ~ normal( 0 , 0.1 );
    
    // random effects priors
    m_b ~ normal( 0 , 0.05 );
    s_b ~ exponential( 2 );
    z_b ~ std_normal();
    m_i ~ normal( 0 , 0.05 );
    s_i ~ exponential( 2 );
    z_i ~ std_normal();
    m_u ~ normal( 0 , 0.05 );
    s_u ~ exponential( 2 );
    to_vector( z_u ) ~ std_normal();
    
    // variability priors
    s_w ~ exponential( 2 );
    //r_s ~ exponential( 2 );
    //z_s ~ exponential( 1 );

    // likelihood
    for(n in 1:N){
      Hwsib[n] ~ normal( mu[n] , s_w );
      // Hwsib[n] ~ normal( mu[n] , s_w[ cid[n] ] );
    }
    
}
generated quantities{

    // track
    vector[N] log_lik;
    
    // log-likelihood
    for(n in 1:N){
      log_lik[n] = normal_lpdf( Hwsib[n] | mu[n] , s_w );
      // log_lik[n] = normal_lpdf( Hwsib[n] | mu[n] , s_w[ cid[n] ] );
    }
    
}
"

# saving
model_nam = "model03.stan"
writeLines(mcmc_code, con=file.path(getwd(), 'real_models', model_nam) )

```
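
Read as equations, the code saved as `model03.stan` corresponds to the sketch below. The symbols simply mirror the Stan identifiers ($H_n$ for `Hwsib[n]`, $\alpha$ for `aHS`, $\beta$ for `bAmHS`, $A_n$ for `Am[n]`), so they are not necessarily the notation used elsewhere in this document.

$$
\begin{aligned}
H_n &\sim \mathrm{Normal}( \mu_n , s_w ) \\
\mu_n &= \alpha_{HS[n]} + \beta_{HS[n]} \, A_n + b_{bid[n]} + e_{cid[n]} + u_{cid[n],\,uid[n]}
\end{aligned}
$$

Here $b$, $e$, and $u$ are the block, speaker, and speaker-by-sentence random effects (`b_i`, `e_i`, and `u_si` in the code).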
```{r}

mcmc_code = "
data{

    // dimensions
    int N;                // number of experimental runs
    int B;                // max. number of blocks
    int I;                // max. number of experimental units (speakers)
    int U;                // max. number of sentences
    int W;                // max. number of words

    // category numbers
    int cHS;              // max. number of categories in HS
    
    // data
    array[N] int<lower=1, upper=B> bid;   // block id
    array[N] int<lower=1, upper=I> cid;   // speaker's id
    array[N] int<lower=1, upper=U> uid;   // sentence's id
    
    array[N] int<lower=1, upper=cHS> HS;  // hearing status
    array[N] real Am;     // chron. age - min( chron. age )
    array[N] real Hwsib;  // replicated entropies
    
}
parameters{
    
    // fixed effects parameters
    real a;               // intercept
    // vector[cHS] aHS;      // HS effects
    // real bAm;             // Am effects
    // vector[cHS] bAmHS;    // Am effects (per HS)

    // random effects parameters
    real m_b;             // block RE mean
    real<lower=0> s_b;    // block RE sd
    vector[B] z_b;        // non-centered block RE
    real m_i;             // speaker RE mean
    real<lower=0> s_i;    // speaker RE sd
    vector[I] z_i;        // non-centered speaker RE
    real m_u;             // speaker, utterance RE mean
    real<lower=0> s_u;    // speaker, utterance RE sd
    matrix[I,U] z_u;      // non-centered speaker, utterance RE
    
    // variability parameters
    // real<lower=0> s_w;      // speaker, utterance, word sd (one overall)
    real<lower=0> r_s;      // global rate for SD
    vector<lower=0>[I] z_s; // non-centered speaker, utterance, word sd (one per speaker)
    
}
transformed parameters{
    
    // to track
    vector[B] b_i;        // block random effects
    vector[I] e_i;        // speaker random effects
    matrix[I,U] u_si;     // sentence random effects
    vector[I] s_w;        // speaker, utterance, word sd (one per speaker)
    vector[N] mu;         // NO TRACK
    
    // random effects
    b_i = m_b + s_b*z_b;  // non-centered block RE
    e_i = m_i + s_i*z_i;  // non-centered speaker RE
    u_si = m_u + s_u*z_u; // non-centered utterance RE
    s_w = z_s * r_s;      // non-centered speaker, utterance, word sd
    
    // average entropy
    for(n in 1:N){
      mu[n] = a + 
        // aHS[ HS[n] ] + 
        // bAm*Am[n] + 
        // bAmHS[ HS[n] ]*Am[n] + 
        b_i[ bid[n] ] +
        e_i[ cid[n] ] +
        u_si[ cid[n], uid[n] ];
    }

}
model{
    
    // fixed effects priors
    a ~ normal( 0 , 0.05 );
    // aHS ~ normal( 0, 0.2 );
    // bAm ~ normal( 0 , 0.1 );
    // bAmHS ~ normal( 0 , 0.1 );
    
    // random effects priors
    m_b ~ normal( 0 , 0.05 );
    s_b ~ exponential( 2 );
    z_b ~ std_normal();
    m_i ~ normal( 0 , 0.05 );
    s_i ~ exponential( 2 );
    z_i ~ std_normal();
    m_u ~ normal( 0 , 0.05 );
    s_u ~ exponential( 2 );
    to_vector( z_u ) ~ std_normal();
    
    // variability priors
    //s_w ~ exponential( 2 );
    r_s ~ exponential( 2 );
    z_s ~ exponential( 1 );

    // likelihood
    for(n in 1:N){
      // Hwsib[n] ~ normal( mu[n] , s_w );
      Hwsib[n] ~ normal( mu[n] , s_w[ cid[n] ] );
    }
    
}
generated quantities{

    // track
    vector[N] log_lik;
    
    // log-likelihood
    for(n in 1:N){
      // log_lik[n] = normal_lpdf( Hwsib[n] | mu[n] , s_w );
      log_lik[n] = normal_lpdf( Hwsib[n] | mu[n] , s_w[ cid[n] ] );
    }
    
}
"

# saving
model_nam = "model04.stan"
writeLines(mcmc_code, con=file.path(getwd(), 'real_models', model_nam) )

```
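
In `model04.stan`, the single residual SD is replaced by one SD per speaker, built as `s_w = z_s * r_s`. Under the priors stated in the code, a short derivation (a sketch, writing $s_{w,i}$ for `s_w[i]` and $z_{s,i}$ for `z_s[i]`) shows what this construction implies:

$$
z_{s,i} \sim \mathrm{Exponential}(1), \quad s_{w,i} = r_s \, z_{s,i}
\;\; \Rightarrow \;\;
s_{w,i} \mid r_s \sim \mathrm{Exponential}( 1 / r_s ), \quad
\mathrm{E}[ \, s_{w,i} \mid r_s \, ] = r_s
$$

So `r_s` acts as the prior mean shared by the speaker-level SDs, and its own prior, `exponential( 2 )`, has mean 0.5.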
```{r}

mcmc_code = "
data{

    // dimensions
    int N;                // number of experimental runs
    int B;                // max. number of blocks
    int I;                // max. number of experimental units (speakers)
    int U;                // max. number of sentences
    int W;                // max. number of words

    // category numbers
    int cHS;              // max. number of categories in HS
    
    // data
    array[N] int<lower=1, upper=B> bid;   // block id
    array[N] int<lower=1, upper=I> cid;   // speaker's id
    array[N] int<lower=1, upper=U> uid;   // sentence's id
    
    array[N] int<lower=1, upper=cHS> HS;  // hearing status
    array[N] real Am;     // chron. age - min( chron. age )
    array[N] real Hwsib;  // replicated entropies
    
}
parameters{
    
    // fixed effects parameters
    //real a;               // intercept
    vector[cHS] aHS;      // HS effects
    real bAm;             // Am effects
    // vector[cHS] bAmHS;    // Am effects (per HS)

    // random effects parameters
    real m_b;             // block RE mean
    real<lower=0> s_b;    // block RE sd
    vector[B] z_b;        // non-centered block RE
    real m_i;             // speaker RE mean
    real<lower=0> s_i;    // speaker RE sd
    vector[I] z_i;        // non-centered speaker RE
    real m_u;             // speaker, utterance RE mean
    real<lower=0> s_u;    // speaker, utterance RE sd
    matrix[I,U] z_u;      // non-centered speaker, utterance RE
    
    // variability parameters
    // real<lower=0> s_w;      // speaker, utterance, word sd (one overall)
    real<lower=0> r_s;      // global rate for SD
    vector<lower=0>[I] z_s; // non-centered speaker, utterance, word sd (one per speaker)
    
}
transformed parameters{
    
    // to track
    vector[B] b_i;        // block random effects
    vector[I] e_i;        // speaker random effects
    matrix[I,U] u_si;     // sentence random effects
    vector[I] s_w;        // speaker, utterance, word sd (one per speaker)
    vector[N] mu;         // NO TRACK
    
    // random effects
    b_i = m_b + s_b*z_b;  // non-centered block RE
    e_i = m_i + s_i*z_i;  // non-centered speaker RE
    u_si = m_u + s_u*z_u; // non-centered utterance RE
    s_w = z_s * r_s;      // non-centered speaker, utterance, word sd
    
    // average entropy
    for(n in 1:N){
      mu[n] = //a + 
        aHS[ HS[n] ] + 
        bAm*Am[n] + 
        // bAmHS[ HS[n] ]*Am[n] + 
        b_i[ bid[n] ] +
        e_i[ cid[n] ] +
        u_si[ cid[n], uid[n] ];
    }

}
model{

    // fixed effects priors
    //a ~ normal( 0 , 0.05 );
    aHS ~ normal( 0, 0.2 );
    bAm ~ normal( 0 , 0.1 );
    //bAmHS ~ normal( 0 , 0.1 );
    
    // random effects priors
    m_b ~ normal( 0 , 0.05 );
    s_b ~ exponential( 2 );
    z_b ~ std_normal();
    m_i ~ normal( 0 , 0.05 );
    s_i ~ exponential( 2 );
    z_i ~ std_normal();
    m_u ~ normal( 0 , 0.05 );
    s_u ~ exponential( 2 );
    to_vector( z_u ) ~ std_normal();
    
    // variability priors
    //s_w ~ exponential( 2 );
    r_s ~ exponential( 2 );
    z_s ~ exponential( 1 );

    // likelihood
    for(n in 1:N){
      // Hwsib[n] ~ normal( mu[n] , s_w );
      Hwsib[n] ~ normal( mu[n] , s_w[ cid[n] ] );
    }
    
}
generated quantities{

    // track
    vector[N] log_lik;
    
    // log-likelihood
    for(n in 1:N){
      // log_lik[n] = normal_lpdf( Hwsib[n] | mu[n] , s_w );
      log_lik[n] = normal_lpdf( Hwsib[n] | mu[n] , s_w[ cid[n] ] );
    }
    
}
"

# saving
model_nam = "model05.stan"
writeLines(mcmc_code, con=file.path(getwd(), 'real_models', model_nam) )

```
```{r}

mcmc_code = "
data{

    // dimensions
    int N;                // number of experimental runs
    int B;                // max. number of blocks
    int I;                // max. number of experimental units (speakers)
    int U;                // max. number of sentences
    int W;                // max. number of words

    // category numbers
    int cHS;              // max. number of categories in HS
    
    // data
    array[N] int<lower=1, upper=B> bid;   // block id
    array[N] int<lower=1, upper=I> cid;   // speaker's id
    array[N] int<lower=1, upper=U> uid;   // sentence's id
    
    array[N] int<lower=1, upper=cHS> HS;  // hearing status
    array[N] real Am;     // chron. age - min( chron. age )
    array[N] real Hwsib;  // replicated entropies
    
}
parameters{
    
    // fixed effects parameters
    //real a;               // intercept
    vector[cHS] aHS;      // HS effects
    // real bAm;             // Am effects
    vector[cHS] bAmHS;    // Am effects (per HS)

    // random effects parameters
    real m_b;             // block RE mean
    real<lower=0> s_b;    // block RE sd
    vector[B] z_b;        // non-centered block RE
    real m_i;             // speaker RE mean
    real<lower=0> s_i;    // speaker RE sd
    vector[I] z_i;        // non-centered speaker RE
    real m_u;             // speaker, utterance RE mean
    real<lower=0> s_u;    // speaker, utterance RE sd
    matrix[I,U] z_u;      // non-centered speaker, utterance RE
    
    // variability parameters
    // real<lower=0> s_w;      // speaker, utterance, word sd (one overall)
    real<lower=0> r_s;      // global rate for SD
    vector<lower=0>[I] z_s; // non-centered speaker, utterance, word sd (one per speaker)
    
}
transformed parameters{
    
    // to track
    vector[B] b_i;        // block random effects
    vector[I] e_i;        // speaker random effects
    matrix[I,U] u_si;     // sentence random effects
    vector[I] s_w;        // speaker, utterance, word sd (one per speaker)
    vector[N] mu;         // NO TRACK
    
    // random effects
    b_i = m_b + s_b*z_b;  // non-centered block RE
    e_i = m_i + s_i*z_i;  // non-centered speaker RE
    u_si = m_u + s_u*z_u; // non-centered utterance RE
    s_w = z_s * r_s;      // non-centered speaker, utterance, word sd
    
    // average entropy
    for(n in 1:N){
      mu[n] = //a + 
        aHS[ HS[n] ] + 
        // bAm*Am[n] + 
        bAmHS[ HS[n] ]*Am[n] + 
        b_i[ bid[n] ] +
        e_i[ cid[n] ] +
        u_si[ cid[n], uid[n] ];
    }

}
model{
    
    // fixed effects priors
    //a ~ normal( 0 , 0.05 );
    aHS ~ normal( 0, 0.2 );
    // bAm ~ normal( 0 , 0.1 );
    bAmHS ~ normal( 0 , 0.1 );
    
    // random effects priors
    m_b ~ normal( 0 , 0.05 );
    s_b ~ exponential( 2 );
    z_b ~ std_normal();
    m_i ~ normal( 0 , 0.05 );
    s_i ~ exponential( 2 );
    z_i ~ std_normal();
    m_u ~ normal( 0 , 0.05 );
    s_u ~ exponential( 2 );
    to_vector( z_u ) ~ std_normal();
    
    // variability priors
    //s_w ~ exponential( 2 );
    r_s ~ exponential( 2 );
    z_s ~ exponential( 1 );

    // likelihood
    for(n in 1:N){
      // Hwsib[n] ~ normal( mu[n] , s_w );
      Hwsib[n] ~ normal( mu[n] , s_w[ cid[n] ] );
    }
    
}
generated quantities{

    // track
    vector[N] log_lik;
    
    // log-likelihood
    for(n in 1:N){
      // log_lik[n] = normal_lpdf( Hwsib[n] | mu[n] , s_w );
      log_lik[n] = normal_lpdf( Hwsib[n] | mu[n] , s_w[ cid[n] ] );
    }
    
}
"

# saving
model_nam = "model06.stan"
writeLines(mcmc_code, con=file.path(getwd(), 'real_models', model_nam) )

```
```{r}

mcmc_code = "
data{

    // dimensions
    int N;                // number of experimental runs
    int B;                // max. number of blocks
    int I;                // max. number of experimental units (speakers)
    int U;                // max. number of sentences
    int W;                // max. number of words

    // category numbers
    int cHS;              // max. number of categories in HS
    
    // data
    array[N] int<lower=1, upper=B> bid;   // block id
    array[N] int<lower=1, upper=I> cid;   // speaker's id
    array[N] int<lower=1, upper=U> uid;   // sentence's id
    
    array[N] int<lower=1, upper=cHS> HS;  // hearing status
    array[N] real Am;     // chron. age - min( chron. age )
    array[N] real Hwsib;  // replicated entropies
    
}
parameters{
    
    // fixed effects parameters
    real a;               // intercept
    // vector[cHS] aHS;      // HS effects
    // real bAm;             // Am effects
    // vector[cHS] bAmHS;    // Am effects (per HS)

    // random effects parameters
    real m_b;             // block RE mean
    real<lower=0> s_b;    // block RE sd
    vector[B] z_b;        // non-centered block RE
    real m_i;             // speaker RE mean
    real<lower=0> s_i;    // speaker RE sd
    vector[I] z_i;        // non-centered speaker RE
    real m_u;             // speaker, utterance RE mean
    real<lower=0> s_u;    // speaker, utterance RE sd
    matrix[I,U] z_u;      // non-centered speaker, utterance RE
    
    // variability parameters
    real<lower=0> Mw;     // 'sample size' parameter
    //real<lower=0> r_M;      // global rate for 'sample size'
    //vector<lower=0>[I] z_M; // non-centered 'sample size' (one per speaker)
     
    
}
transformed parameters{
    
    // to track
    vector[B] b_i;        // block random effects
    vector[I] e_i;        // speaker random effects
    matrix[I,U] u_si;     // sentence random effects
    vector[I] u_i;        // sentence average random effects
    //vector[I] Mw;         // non-centered 'sample size' (one per speaker)
    vector[I] SI;         // SI index
    vector[N] mu;         // NO TRACK
    
    // random effects
    b_i = m_b + s_b*z_b;  // non-centered block RE
    e_i = m_i + s_i*z_i;  // non-centered speaker RE
    u_si = m_u + s_u*z_u; // non-centered utterance RE
    //Mw = z_M * r_M;       // non-centered 'sample size'
    
    // intelligibility and average entropy
    for(i in 1:I){
      u_i[ i ] = mean( u_si[ i, ] );
    }
    
    for(n in 1:N){
      SI[ cid[n] ] = a + 
        // aHS[ HS[n] ] + 
        // bAm*Am[n] + 
        // bAmHS[ HS[n] ]*Am[n] + 
        e_i[ cid[n] ] +
        u_i[ cid[n] ];
      mu[n] = inv_logit( b_i[ bid[n] ] - SI[ cid[n] ] );
    }

}
model{

    // fixed effects priors
    a ~ normal( 0 , 0.05 );
    // aHS ~ normal( 0 , 0.3 );
    // bAm ~ normal( 0 , 0.1 );
    // bAmHS ~ normal( 0 , 0.1 );
    
    // random effects priors
    m_b ~ normal( 0 , 0.05 );
    s_b ~ exponential( 2 );
    z_b ~ std_normal();
    m_i ~ normal( 0 , 0.05 );
    s_i ~ exponential( 2 );
    z_i ~ std_normal();
    m_u ~ normal( 0 , 0.05 );
    s_u ~ exponential( 2 );
    to_vector( z_u ) ~ std_normal();
    
    // variability priors
    Mw ~ exponential( 0.4 );
    //r_M ~ exponential( 0.2 );
    //z_M ~ exponential( 1 );

    // likelihood
    for(n in 1:N){
      Hwsib[n] ~ beta_proportion( mu[n] , Mw );
      // Hwsib[n] ~ beta_proportion( mu[n] , Mw[ cid[n] ] );
    }
    
}
generated quantities{

    // track
    vector[N] log_lik;
    
    // log-likelihood
    for(n in 1:N){
      log_lik[n] = beta_proportion_lpdf( Hwsib[n] | mu[n] , Mw );
      // log_lik[n] = beta_proportion_lpdf( Hwsib[n] | mu[n] , Mw[ cid[n] ] );
    }
    
}
"

# saving
model_nam = "model07.stan"
writeLines(mcmc_code, con=file.path(getwd(), 'real_models', model_nam) )

```
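
`model07.stan` replaces the normal likelihood with Stan's `beta_proportion( mu , Mw )`, a beta distribution indexed by its mean and a 'sample size' parameter. It corresponds to a standard beta distribution with shapes `mu*Mw` and `(1-mu)*Mw`. The following is a minimal R sketch of that mapping. The helper name `dbeta_prop` is hypothetical, and the parameter values are illustrative assumptions.

```r
# Sketch: beta-proportion density via the standard beta density.
# dbeta_prop is a hypothetical helper, not a function used in the study.
dbeta_prop = function(x, mu, M, log=FALSE){
  dbeta(x, shape1=mu*M, shape2=(1-mu)*M, log=log)
}

inv_logit = function(x) 1/(1 + exp(-x))

# illustrative values: block effect b and latent intelligibility SI
b = 0; SI = 0.8; M = 5
mu = inv_logit(b - SI)       # expected entropy, as in the Stan code

# the mean of the implied distribution recovers mu
m_num = integrate(function(x) x*dbeta_prop(x, mu, M), 0, 1)$value
c(mu=mu, numerical_mean=m_num)
```

Because of the minus sign inside `inv_logit`, a larger `SI` maps to a lower expected entropy.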
```{r}

mcmc_code = "
data{

    // dimensions
    int N;                // number of experimental runs
    int B;                // max. number of blocks
    int I;                // max. number of experimental units (speakers)
    int U;                // max. number of sentences
    int W;                // max. number of words

    // category numbers
    int cHS;              // max. number of categories in HS
    
    // data
    array[N] int<lower=1, upper=B> bid;   // block id
    array[N] int<lower=1, upper=I> cid;   // speaker's id
    array[N] int<lower=1, upper=U> uid;   // sentence's id
    
    array[N] int<lower=1, upper=cHS> HS;  // hearing status
    array[N] real Am;     // chron. age - min( chron. age )
    array[N] real Hwsib;  // replicated entropies
    
}
parameters{
    
    // fixed effects parameters
    real a;               // intercept
    vector[cHS] aHS;      // HS effects
    real bAm;             // Am effects
    // vector[cHS] bAmHS;    // Am effects (per HS)

    // random effects parameters
    real m_b;             // block RE mean
    real<lower=0> s_b;    // block RE sd
    vector[B] z_b;        // non-centered block RE
    real m_i;             // speaker RE mean
    real<lower=0> s_i;    // speaker RE sd
    vector[I] z_i;        // non-centered speaker RE
    real m_u;             // speaker, utterance RE mean
    real<lower=0> s_u;    // speaker, utterance RE sd
    matrix[I,U] z_u;      // non-centered speaker, utterance RE
    
    // variability parameters
    real<lower=0> Mw;     // 'sample size' parameter
    //real<lower=0> r_M;      // global rate for 'sample size'
    //vector<lower=0>[I] z_M; // non-centered 'sample size' (one per speaker)
     
    
}
transformed parameters{
    
    // to track
    vector[B] b_i;        // block random effects
    vector[I] e_i;        // speaker random effects
    matrix[I,U] u_si;     // sentence random effects
    vector[I] u_i;        // sentence average random effects
    //vector[I] Mw;         // non-centered 'sample size' (one per speaker)
    vector[I] SI;         // SI index
    vector[N] mu;         // NO TRACK
    
    // random effects
    b_i = m_b + s_b*z_b;  // non-centered block RE
    e_i = m_i + s_i*z_i;  // non-centered speaker RE
    u_si = m_u + s_u*z_u; // non-centered utterance RE
    //Mw = z_M * r_M;       // non-centered 'sample size'
    
    // intelligibility and average entropy
    for(i in 1:I){
      u_i[ i ] = mean( u_si[ i, ] );
    }
    
    for(n in 1:N){
      SI[ cid[n] ] = a + 
        aHS[ HS[n] ] + 
        bAm*Am[n] + 
        // bAmHS[ HS[n] ]*Am[n] + 
        e_i[ cid[n] ] +
        u_i[ cid[n] ];
      mu[n] = inv_logit( b_i[ bid[n] ] - SI[ cid[n] ] );
    }

}
model{

    // fixed effects priors
    a ~ normal( 0 , 0.05 );
    aHS ~ normal( 0 , 0.3 );
    bAm ~ normal( 0 , 0.1 );
    // bAmHS ~ normal( 0 , 0.1 );
    
    // random effects priors
    m_b ~ normal( 0 , 0.05 );
    s_b ~ exponential( 2 );
    z_b ~ std_normal();
    m_i ~ normal( 0 , 0.05 );
    s_i ~ exponential( 2 );
    z_i ~ std_normal();
    m_u ~ normal( 0 , 0.05 );
    s_u ~ exponential( 2 );
    to_vector( z_u ) ~ std_normal();

    // variability priors
    Mw ~ exponential( 0.4 );
    //r_M ~ exponential( 0.2 );
    //z_M ~ exponential( 1 );

    // likelihood
    for(n in 1:N){
      Hwsib[n] ~ beta_proportion( mu[n] , Mw );
      // Hwsib[n] ~ beta_proportion( mu[n] , Mw[ cid[n] ] );
    }
    
}
generated quantities{

    // track
    vector[N] log_lik;
    
    // log-likelihood
    for(n in 1:N){
      log_lik[n] = beta_proportion_lpdf( Hwsib[n] | mu[n] , Mw );
      // log_lik[n] = beta_proportion_lpdf( Hwsib[n] | mu[n] , Mw[ cid[n] ] );
    }
    
}
"

# saving
model_nam = "model08.stan"
writeLines(mcmc_code, con=file.path(getwd(), 'real_models', model_nam) )

```
```{r}

mcmc_code = "
data{

    // dimensions
    int N;                // number of experimental runs
    int B;                // max. number of blocks
    int I;                // max. number of experimental units (speakers)
    int U;                // max. number of sentences
    int W;                // max. number of words

    // category numbers
    int cHS;              // max. number of categories in HS
    
    // data
    array[N] int<lower=1, upper=B> bid;   // block id
    array[N] int<lower=1, upper=I> cid;   // speaker's id
    array[N] int<lower=1, upper=U> uid;   // sentence's id
    
    array[N] int<lower=1, upper=cHS> HS;  // hearing status
    array[N] real Am;     // chron. age - min( chron. age )
    array[N] real Hwsib;  // replicated entropies
    
}
parameters{
    
    // fixed effects parameters
    real a;               // intercept
    vector[cHS] aHS;      // HS effects
    // real bAm;             // Am effects
    vector[cHS] bAmHS;    // Am effects (per HS)

    // random effects parameters
    real m_b;             // block RE mean
    real<lower=0> s_b;    // block RE sd
    vector[B] z_b;        // non-centered block RE
    real m_i;             // speaker RE mean
    real<lower=0> s_i;    // speaker RE sd
    vector[I] z_i;        // non-centered speaker RE
    real m_u;             // speaker, utterance RE mean
    real<lower=0> s_u;    // speaker, utterance RE sd
    matrix[I,U] z_u;      // non-centered speaker, utterance RE
    
    // variability parameters
    real<lower=0> Mw;     // 'sample size' parameter
    //real<lower=0> r_M;      // global rate for 'sample size'
    //vector<lower=0>[I] z_M; // non-centered 'sample size' (one per speaker)
     
    
}
transformed parameters{
    
    // to track
    vector[B] b_i;        // block random effects
    vector[I] e_i;        // speaker random effects
    matrix[I,U] u_si;     // sentence random effects
    vector[I] u_i;        // sentence average random effects
    //vector[I] Mw;         // non-centered 'sample size' (one per speaker)
    vector[I] SI;         // SI index
    vector[N] mu;         // NO TRACK
    
    // random effects
    b_i = m_b + s_b*z_b;  // non-centered block RE
    e_i = m_i + s_i*z_i;  // non-centered speaker RE
    u_si = m_u + s_u*z_u; // non-centered utterance RE
    //Mw = z_M * r_M;       // non-centered 'sample size'
    
    // intelligibility and average entropy
    for(i in 1:I){
      u_i[ i ] = mean( u_si[ i, ] );
    }
    
    for(n in 1:N){
      SI[ cid[n] ] = a + 
        aHS[ HS[n] ] + 
        // bAm*Am[n] + 
        bAmHS[ HS[n] ]*Am[n] + 
        e_i[ cid[n] ] +
        u_i[ cid[n] ];
      mu[n] = inv_logit( b_i[ bid[n] ] - SI[ cid[n] ] );
    }

}
model{

    // fixed effects priors
    a ~ normal( 0 , 0.05 );
    aHS ~ normal( 0 , 0.3 );
    // bAm ~ normal( 0 , 0.1 );
    bAmHS ~ normal( 0 , 0.1 );
    
    // random effects priors
    m_b ~ normal( 0 , 0.05 );
    s_b ~ exponential( 2 );
    z_b ~ std_normal();
    m_i ~ normal( 0 , 0.05 );
    s_i ~ exponential( 2 );
    z_i ~ std_normal();
    m_u ~ normal( 0 , 0.05 );
    s_u ~ exponential( 2 );
    to_vector( z_u ) ~ std_normal();
    
    // variability priors
    Mw ~ exponential( 0.4 );
    //r_M ~ exponential( 0.2 );
    //z_M ~ exponential( 1 );

    // likelihood
    for(n in 1:N){
      Hwsib[n] ~ beta_proportion( mu[n] , Mw );
      // Hwsib[n] ~ beta_proportion( mu[n] , Mw[ cid[n] ] );
    }
    
}
generated quantities{

    // track
    vector[N] log_lik;
    
    // log-likelihood
    for(n in 1:N){
      log_lik[n] = beta_proportion_lpdf( Hwsib[n] | mu[n] , Mw );
      // log_lik[n] = beta_proportion_lpdf( Hwsib[n] | mu[n] , Mw[ cid[n] ] );
    }
    
}
"

# saving
model_nam = "model09.stan"
writeLines(mcmc_code, con=file.path(getwd(), 'real_models', model_nam) )

```
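
Written out, the measurement and structural parts of `model09.stan` can be sketched as below. The symbols mirror the Stan identifiers ($H_n$ for `Hwsib[n]`, $\alpha$ for `aHS`, $\beta$ for `bAmHS`, $A_i$ for `Am`). The sketch assumes that hearing status and age are constant within a speaker, so that `SI` takes one value per speaker.

$$
\begin{aligned}
H_n &\sim \mathrm{BetaProportion}( \mu_n , M_w ) \\
\mu_n &= \mathrm{logit}^{-1}\left( b_{bid[n]} - SI_{cid[n]} \right) \\
SI_i &= a + \alpha_{HS[i]} + \beta_{HS[i]} \, A_i + e_i + \bar{u}_i ,
\qquad \bar{u}_i = \frac{1}{U} \sum_{j=1}^{U} u_{i,j}
\end{aligned}
$$

The sentence-level effects enter the latent index only through their speaker average $\bar{u}_i$ (`u_i` in the code).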
```{r}

mcmc_code = "
data{

    // dimensions
    int N;                // number of experimental runs
    int B;                // max. number of blocks
    int I;                // max. number of experimental units (speakers)
    int U;                // max. number of sentences
    int W;                // max. number of words

    // category numbers
    int cHS;              // max. number of categories in HS
    
    // data
    array[N] int<lower=1, upper=B> bid;   // block id
    array[N] int<lower=1, upper=I> cid;   // speaker's id
    array[N] int<lower=1, upper=U> uid;   // sentence's id
    
    array[N] int<lower=1, upper=cHS> HS;  // hearing status
    array[N] real Am;     // chron. age - min( chron. age )
    array[N] real Hwsib;  // replicated entropies
    
}
parameters{
    
    // fixed effects parameters
    real a;               // intercept
    // vector[cHS] aHS;      // HS effects
    // real bAm;             // Am effects
    // vector[cHS] bAmHS;    // Am effects (per HS)

    // random effects parameters
    real m_b;             // block RE mean
    real<lower=0> s_b;    // block RE sd
    vector[B] z_b;        // non-centered block RE
    real m_i;             // speaker RE mean
    real<lower=0> s_i;    // speaker RE sd
    vector[I] z_i;        // non-centered speaker RE
    real m_u;             // speaker, utterance RE mean
    real<lower=0> s_u;    // speaker, utterance RE sd
    matrix[I,U] z_u;      // non-centered speaker, utterance RE
    
    // variability parameters
    // real<lower=0> Mw;    // 'sample size' parameter
    real<lower=0> r_M;      // global rate for 'sample size'
    vector<lower=0>[I] z_M; // non-centered 'sample size' (one per speaker)
     
    
}
transformed parameters{
    
    // to track
    vector[B] b_i;        // block random effects
    vector[I] e_i;        // speaker random effects
    matrix[I,U] u_si;     // sentence random effects
    vector[I] u_i;        // sentence average random effects
    vector[I] Mw;         // non-centered 'sample size' (one per speaker)
    vector[I] SI;         // SI index
    vector[N] mu;         // NO TRACK
    
    // random effects
    b_i = m_b + s_b*z_b;  // non-centered block RE
    e_i = m_i + s_i*z_i;  // non-centered speaker RE
    u_si = m_u + s_u*z_u; // non-centered utterance RE
    Mw = z_M * r_M;       // non-centered 'sample size'
    
    // intelligibility and average entropy
    for(i in 1:I){
      u_i[ i ] = mean( u_si[ i, ] );
    }
    
    for(n in 1:N){
      SI[ cid[n] ] = a + 
        // aHS[ HS[n] ] + 
        // bAm*Am[n] + 
        // bAmHS[ HS[n] ]*Am[n] + 
        e_i[ cid[n] ] +
        u_i[ cid[n] ];
      mu[n] = inv_logit( b_i[ bid[n] ] - SI[ cid[n] ] );
    }

}
model{

    // fixed effects priors
    a ~ normal( 0 , 0.05 );
    // aHS ~ normal( 0 , 0.3 );
    // bAm ~ normal( 0 , 0.1 );
    // bAmHS ~ normal( 0 , 0.1 );
    
    // random effects priors
    m_b ~ normal( 0 , 0.05 );
    s_b ~ exponential( 2 );
    z_b ~ std_normal();
    m_i ~ normal( 0 , 0.05 );
    s_i ~ exponential( 2 );
    z_i ~ std_normal();
    m_u ~ normal( 0 , 0.05 );
    s_u ~ exponential( 2 );
    to_vector( z_u ) ~ std_normal();
    
    // variability priors
    //Mw ~ exponential( 0.4 );
    r_M ~ exponential( 0.2 );
    z_M ~ exponential( 1 );

    // likelihood
    for(n in 1:N){
      // Hwsib[n] ~ beta_proportion( mu[n] , Mw );
      Hwsib[n] ~ beta_proportion( mu[n] , Mw[ cid[n] ] );
    }
    
}
generated quantities{

    // track
    vector[N] log_lik;
    
    // log-likelihood
    for(n in 1:N){
      // log_lik[n] = beta_proportion_lpdf( Hwsib[n] | mu[n] , Mw );
      log_lik[n] = beta_proportion_lpdf( Hwsib[n] | mu[n] , Mw[ cid[n] ] );
    }
    
}
"

# saving
model_nam = "model10.stan"
writeLines(mcmc_code, con=file.path(getwd(), 'real_models', model_nam) )

```
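The non-centered terms used in the models above can be checked by simulation. The sketch below is not part of the original analysis: it uses illustrative values (assumptions, not estimates from the study) to show that `m + s*z` with a standard normal `z` reproduces a normal random effect with mean `m` and standard deviation `s`. It also shows that, under the code as written, `Mw = z_M * r_M` with a unit-exponential `z_M` yields an exponential variable whose mean is `r_M`.

```r
set.seed(12345)
n_draws <- 1e5

# illustrative values (assumptions, not estimates from the study)
m_b <- 0.2   # mean of a normal random effect
s_b <- 0.5   # sd of a normal random effect
r_M <- 5     # multiplier of the 'sample size' parameter

# normal random effect: m + s*z with z ~ Normal(0, 1)
z_b <- rnorm(n_draws)
b_noncentered <- m_b + s_b * z_b
b_centered <- rnorm(n_draws, mean = m_b, sd = s_b)

# 'sample size': z*r with z ~ Exponential(1)
z_M <- rexp(n_draws, rate = 1)
Mw_noncentered <- z_M * r_M
Mw_centered <- rexp(n_draws, rate = 1 / r_M)

# compare means and sds of both parameterizations
summary_tab <- rbind(
  b  = c(mean(b_noncentered), mean(b_centered),
         sd(b_noncentered), sd(b_centered)),
  Mw = c(mean(Mw_noncentered), mean(Mw_centered),
         sd(Mw_noncentered), sd(Mw_centered))
)
colnames(summary_tab) <- c('mean_nc', 'mean_c', 'sd_nc', 'sd_c')
print(round(summary_tab, 2))
```

Both parameterizations describe the same distribution; the non-centered form only changes the geometry that the sampler explores.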
```{r}

mcmc_code = "
data{

    // dimensions
    int N;                // number of experimental runs
    int B;                // max. number of blocks
    int I;                // max. number of experimental units (speakers)
    int U;                // max. number of sentences
    int W;                // max. number of words

    // category numbers
    int cHS;              // max. number of categories in HS
    
    // data
    array[N] int<lower=1, upper=B> bid;   // block id
    array[N] int<lower=1, upper=I> cid;   // speaker's id
    array[N] int<lower=1, upper=U> uid;   // sentence's id
    
    array[N] int<lower=1, upper=cHS> HS;  // hearing status
    array[N] real Am;     // chron. age - min( chron. age )
    array[N] real Hwsib;  // replicated entropies
    
}
parameters{
    
    // fixed effects parameters
    real a;               // intercept
    vector[cHS] aHS;      // HS effects
    real bAm;             // Am effects
    // vector[cHS] bAmHS;    // Am effects (per HS)

    // random effects parameters
    real m_b;             // block RE mean
    real<lower=0> s_b;    // block RE sd
    vector[B] z_b;        // non-centered block RE
    real m_i;             // speaker RE mean
    real<lower=0> s_i;    // speaker RE sd
    vector[I] z_i;        // non-centered speaker RE
    real m_u;             // speaker, utterance RE mean
    real<lower=0> s_u;    // speaker, utterance RE sd
    matrix[I,U] z_u;      // non-centered speaker, utterance RE
    
    // variability parameters
    // real<lower=0> Mw;     // 'sample size' parameter
    real<lower=0> r_M;      // global rate for 'sample size'
    vector<lower=0>[I] z_M; // non-centered 'sample size' (one per speaker)
    
}
transformed parameters{
    
    // to track
    vector[B] b_i;        // block random effects
    vector[I] e_i;        // speaker random effects
    matrix[I,U] u_si;     // sentence random effects
    vector[I] u_i;        // sentence average random effects
    vector[I] Mw;         // non-centered 'sample size' (one per speaker)
    vector[I] SI;         // SI index
    vector[N] mu;         // NO TRACK
    
    // random effects
    b_i = m_b + s_b*z_b;  // non-centered block RE
    e_i = m_i + s_i*z_i;  // non-centered speaker RE
    u_si = m_u + s_u*z_u; // non-centered utterance RE
    Mw = z_M * r_M;       // non-centered 'sample size'
    
    // intelligibility and average entropy
    for(i in 1:I){
      u_i[ i ] = mean( u_si[ i, ] );
    }
    
    for(n in 1:N){
      SI[ cid[n] ] = a + 
        aHS[ HS[n] ] + 
        bAm*Am[n] + 
        // bAmHS[ HS[n] ]*Am[n] + 
        e_i[ cid[n] ] +
        u_i[ cid[n] ];
      mu[n] = inv_logit( b_i[ bid[n] ] - SI[ cid[n] ] );
    }

}
model{

    // fixed effects priors
    a ~ normal( 0 , 0.05 );
    aHS ~ normal( 0 , 0.3 );
    bAm ~ normal( 0 , 0.1 );
    // bAmHS ~ normal( 0 , 0.1 );
    
    // random effects priors
    m_b ~ normal( 0 , 0.05 );
    s_b ~ exponential( 2 );
    z_b ~ std_normal();
    m_i ~ normal( 0 , 0.05 );
    s_i ~ exponential( 2 );
    z_i ~ std_normal();
    m_u ~ normal( 0 , 0.05 );
    s_u ~ exponential( 2 );
    to_vector( z_u ) ~ std_normal();
    
    // variability priors
    //Mw ~ exponential( 0.4 );
    r_M ~ exponential( 0.2 );
    z_M ~ exponential( 1 );

    // likelihood
    for(n in 1:N){
      // Hwsib[n] ~ beta_proportion( mu[n] , Mw );
      Hwsib[n] ~ beta_proportion( mu[n] , Mw[ cid[n] ] );
    }
    
}
generated quantities{

    // track
    vector[N] log_lik;
    
    // log-likelihood
    for(n in 1:N){
      // log_lik[n] = beta_proportion_lpdf( Hwsib[n] | mu[n] , Mw );
      log_lik[n] = beta_proportion_lpdf( Hwsib[n] | mu[n] , Mw[ cid[n] ] );
    }
    
}
"

# saving
model_nam = "model11.stan"
writeLines(mcmc_code, con=file.path(getwd(), 'real_models', model_nam) )

```
```{r}

mcmc_code = "
data{

    // dimensions
    int N;                // number of experimental runs
    int B;                // max. number of blocks
    int I;                // max. number of experimental units (speakers)
    int U;                // max. number of sentences
    int W;                // max. number of words

    // category numbers
    int cHS;              // max. number of categories in HS
    
    // data
    array[N] int<lower=1, upper=B> bid;   // block id
    array[N] int<lower=1, upper=I> cid;   // speaker's id
    array[N] int<lower=1, upper=U> uid;   // sentence's id
    
    array[N] int<lower=1, upper=cHS> HS;  // hearing status
    array[N] real Am;     // chron. age - min( chron. age )
    array[N] real Hwsib;  // replicated entropies
    
}
parameters{
    
    // fixed effects parameters
    real a;               // intercept
    vector[cHS] aHS;      // HS effects
    // real bAm;             // Am effects
    vector[cHS] bAmHS;    // Am effects (per HS)

    // random effects parameters
    real m_b;             // block RE mean
    real<lower=0> s_b;    // block RE sd
    vector[B] z_b;        // non-centered block RE
    real m_i;             // speaker RE mean
    real<lower=0> s_i;    // speaker RE sd
    vector[I] z_i;        // non-centered speaker RE
    real m_u;             // speaker, utterance RE mean
    real<lower=0> s_u;    // speaker, utterance RE sd
    matrix[I,U] z_u;      // non-centered speaker, utterance RE
    
    // variability parameters
    // real<lower=0> Mw;     // 'sample size' parameter
    real<lower=0> r_M;      // global rate for 'sample size'
    vector<lower=0>[I] z_M; // non-centered 'sample size' (one per speaker)
    
}
transformed parameters{
    
    // to track
    vector[B] b_i;        // block random effects
    vector[I] e_i;        // speaker random effects
    matrix[I,U] u_si;     // sentence random effects
    vector[I] u_i;        // sentence average random effects
    vector[I] Mw;         // non-centered 'sample size' (one per speaker)
    vector[I] SI;         // SI index
    vector[N] mu;         // NO TRACK
    
    // random effects
    b_i = m_b + s_b*z_b;  // non-centered block RE
    e_i = m_i + s_i*z_i;  // non-centered speaker RE
    u_si = m_u + s_u*z_u; // non-centered utterance RE
    Mw = z_M * r_M;       // non-centered 'sample size'
    
    // intelligibility and average entropy
    for(i in 1:I){
      u_i[ i ] = mean( u_si[ i, ] );
    }
    
    for(n in 1:N){
      SI[ cid[n] ] = a + 
        aHS[ HS[n] ] + 
        // bAm*Am[n] + 
        bAmHS[ HS[n] ]*Am[n] + 
        e_i[ cid[n] ] +
        u_i[ cid[n] ];
      mu[n] = inv_logit( b_i[ bid[n] ] - SI[ cid[n] ] );
    }

}
model{

    // fixed effects priors
    a ~ normal( 0 , 0.05 );
    aHS ~ normal( 0 , 0.3 );
    // bAm ~ normal( 0 , 0.3 );
    bAmHS ~ normal( 0 , 0.3 );
    
    // random effects priors
    m_b ~ normal( 0 , 0.05 );
    s_b ~ exponential( 2 );
    z_b ~ std_normal();
    m_i ~ normal( 0 , 0.05 );
    s_i ~ exponential( 2 );
    z_i ~ std_normal();
    m_u ~ normal( 0 , 0.05 );
    s_u ~ exponential( 2 );
    to_vector( z_u ) ~ std_normal();
    
    // variability priors
    //Mw ~ exponential( 0.4 );
    r_M ~ exponential( 0.2 );
    z_M ~ exponential( 1 );

    // likelihood
    for(n in 1:N){
      // Hwsib[n] ~ beta_proportion( mu[n] , Mw );
      Hwsib[n] ~ beta_proportion( mu[n] , Mw[ cid[n] ] );
    }
    
}
generated quantities{

    // track
    vector[N] log_lik;
    
    // log-likelihood
    for(n in 1:N){
      // log_lik[n] = beta_proportion_lpdf( Hwsib[n] | mu[n] , Mw );
      log_lik[n] = beta_proportion_lpdf( Hwsib[n] | mu[n] , Mw[ cid[n] ] );
    }
    
}
"

# saving
model_nam = "model12.stan"
writeLines(mcmc_code, con=file.path(getwd(), 'real_models', model_nam) )

```

Moreover, the following code is provided so the reader can fit all Stan models.

```{r}

for(i in 1:12){

  model_nam = paste0( ifelse(i<10, 'model0', 'model'), i, '.stan')
  model_in = file.path(getwd(), 'real_models')
  model_out = file.path(getwd(), 'real_chain')
  mod = cmdstan_model( file.path(model_in, model_nam) )

  print(model_nam)
  mod$sample( data=dlist,
              output_dir=model_out,
              output_basename = str_replace(model_nam, fixed('.stan'), ''),
              iter_warmup=2000, iter_sampling=2000, # current cmdstanr argument names
              chains=4, parallel_chains=4,
              max_treedepth=20, adapt_delta=0.95) #,init=0

}

            
```

7.4 Estimation

The models were estimated using R version 4.2.2 (R Core Team, 2015) and Stan version 2.26.1 (Stan Development Team, 2021). Four Markov chains were run for each model, each with distinct starting values. Each chain ran for 4,000 iterations: the first 2,000 served as a warm-up phase, and the remaining 2,000 were retained as samples from the posterior distribution.

7.5 Chain quality and information

Stationarity, convergence, and mixing of the parameter chains were verified through graphical analysis and diagnostic statistics. The graphical analysis relied on trace, trace-rank, and autocorrelation function (ACF) plots. The diagnostic statistics included the potential scale reduction factor \widehat{\text{R}}, with a cut-off value of 1.05 (A. Vehtari, Gelman, Simpson, Carpenter, & Bürkner, 2021a). Furthermore, to confirm that the parameters’ posterior distributions were generated with a sufficient number of uncorrelated sampling points, each posterior density plot was inspected along with its effective sample size statistic n_{\text{eff}} (Gelman et al., 2014).
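To illustrate how the potential scale reduction factor flags poor mixing, the following sketch computes a classic split-\widehat{\text{R}} by hand on simulated chains. It is a simplified illustration, not the rank-normalized version proposed by Vehtari et al. (2021a), and the function name `split_rhat` and the simulated draws are hypothetical.

```r
# Classic split R-hat for one parameter,
# given an (iterations x chains) matrix of draws.
split_rhat <- function(draws) {
  n_iter <- nrow(draws)
  half <- floor(n_iter / 2)
  # split each chain in half so that within-chain drift is also detected
  splits <- cbind(draws[1:half, , drop = FALSE],
                  draws[(n_iter - half + 1):n_iter, , drop = FALSE])
  n <- nrow(splits)
  chain_means <- colMeans(splits)
  chain_vars <- apply(splits, 2, var)
  B <- n * var(chain_means)             # between-chain variance
  W <- mean(chain_vars)                 # within-chain variance
  var_plus <- (n - 1) / n * W + B / n   # pooled variance estimate
  sqrt(var_plus / W)
}

set.seed(12345)
# four well-mixed chains with 2,000 post-warm-up draws each (simulated)
good <- matrix(rnorm(2000 * 4), nrow = 2000, ncol = 4)
# same chains, but one of them explores a different region
bad <- good
bad[, 1] <- bad[, 1] + 3

rhat_good <- split_rhat(good)
rhat_bad <- split_rhat(bad)
print(round(c(good = rhat_good, bad = rhat_bad), 3))
```

Well-mixed chains yield a value close to one, whereas the shifted chain pushes the statistic well above the 1.05 cut-off.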

7.6 Model comparison

The study compares the fitted models using three criteria: the deviance information criterion (DIC; Spiegelhalter, Best, Carlin, & van der Linde, 2002), the widely applicable information criterion (WAIC; Watanabe, 2013), and the Pareto-smoothed importance sampling criterion (PSIS; Vehtari et al., 2017). These criteria score models in terms of deviations from perfect predictive accuracy, with smaller values indicating less deviation (McElreath, 2020). Specifically, DIC measures in-sample deviations, while WAIC and PSIS offer an approximate measure of out-of-sample deviations. Deviations from perfect predictive accuracy serve as the closest estimate of the Kullback-Leibler divergence (Kullback & Leibler, 1951), which measures the degree to which a model accurately represents the true distribution of the data. Moreover, WAIC and PSIS are considered fully Bayesian criteria, as they incorporate all the information contained in the parameters’ posterior distribution. This effectively integrates and reports the inherent uncertainty in the predictive accuracy estimates. Predictive accuracy aside, PSIS offers an additional advantage: it identifies highly influential data points. To achieve this, the criterion uses a built-in warning system that flags observations that make out-of-sample predictions unreliable. The key intuition is that observations that are relatively unlikely according to the model exert more influence and render predictions more unreliable than relatively expected ones (McElreath, 2020).
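The `log_lik` vector stored in the generated quantities block of each Stan model is the ingredient from which WAIC is computed. The sketch below shows the calculation on simulated data, assuming the definition WAIC = -2(lppd - pWAIC) described in McElreath (2020). The data, the stand-in posterior draws, and all object names are hypothetical.

```r
set.seed(12345)

# simulated observations and stand-in posterior draws (assumptions)
y <- rnorm(50, mean = 0.3, sd = 1)
n_draws <- 4000
mu_draws <- rnorm(n_draws, mean = mean(y), sd = 1 / sqrt(length(y)))

# (draws x observations) matrix of pointwise log-likelihoods,
# the analogue of the log_lik vector tracked in the Stan models
log_lik <- sapply(y, function(yn) dnorm(yn, mean = mu_draws, sd = 1, log = TRUE))

# log pointwise predictive density, one value per observation
lppd_n <- log(colMeans(exp(log_lik)))
# complexity penalty: posterior variance of the log-likelihood
p_waic_n <- apply(log_lik, 2, var)

waic <- -2 * (sum(lppd_n) - sum(p_waic_n))
deviance_in <- -2 * sum(lppd_n)   # in-sample deviance, equal to WAIC - 2*pWAIC

print(round(c(WAIC = waic, pWAIC = sum(p_waic_n), deviance = deviance_in), 2))
```

For long vectors of small likelihoods, a log-sum-exp formulation would be numerically safer than `exp()` followed by `log()`; the direct form is kept here for readability.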

8 Results

This section presents the results of the Bayesian inference procedures, with particular emphasis on answering the three research questions.

The posterior estimates of the models are loaded in the following manner. file_id() is a user-defined function that identifies the stanfit generated files within a particular directory.

```{r}

# load reference models
for(i in 1:12){
  model_nam = paste0( ifelse(i<10, 'model0', 'model'), i)
  model_out = file.path( save_dir, 'real_chain')
  model_fit = file_id( model_out, model_nam )
  assign( model_nam,
          rstan::read_stan_csv( file.path( model_out, model_fit ) ) )
}

```

8.1 Predictive capabilities of the Beta-proportion GLLAMM compared to the Normal LMM (RQ1)

This research question evaluates the effectiveness of the Beta-proportion GLLAMM in handling the features of entropy scores by comparing its predictive accuracy to the Normal LMM. Models 1, 4, 7, and 10 are specifically chosen for this comparison because their assumptions exclusively address the features of the scores, without integrating additional covariate information. As detailed in Table 2, Model 1 is a Normal LMM that solely addresses data clustering. Building upon this, Model 4 introduces a robust feature. Conversely, Model 7 is a Beta-proportion GLLAMM that deals with boundedness, measurement error and data clustering, and Model 10 extends this model by incorporating a robust feature.

Figure 35 displays the values of the DIC, WAIC, and PSIS. It also shows dWAIC and dPSIS, the differences in out-of-sample deviation from the best-fitting model, along with their associated uncertainty. Table 3 and Table 4 provide the same information and additionally report the pWAIC and pPSIS values, which indicate the penalty each model receives for its complexity (roughly associated with its number of parameters). Lastly, the tables show the weight of evidence, which summarizes the relative support for each model.

Overall, all criteria consistently point to Model 10 as the most plausible choice for the data. The model exhibits the lowest values for both WAIC and PSIS, making it the model with the least deviation from perfect predictive accuracy among those compared. Additionally, Figure 35 shows that the uncertainty (horizontal blue lines) around the dWAIC and dPSIS values of Models 1, 4, and 7 does not overlap with the reference value of zero that corresponds to Model 10. This indicates that Model 10 deviates significantly less from perfect predictive accuracy than the other models. Lastly, the weight of evidence in Table 3 and Table 4 underscores that 100\% of the evidence supports Model 10.

```{r}
# seed for replication
set.seed(12345)

# comparison of selected models with WAIC
RQ1_WAIC = compare( func=WAIC,
                    model01, model04,
                    model07, model10 )

# DIC calculation
RQ1_WAIC = cbind( DIC=with(RQ1_WAIC, WAIC-2*pWAIC),
                  RQ1_WAIC )
```
```{r}
# seed for replication
set.seed(12345)

# comparison of selected models with PSIS
RQ1_PSIS = compare( func=PSIS,
                    model01, model04,
                    model07, model10 )

# DIC calculation
RQ1_PSIS = cbind( DIC=with(RQ1_PSIS, PSIS-2*pPSIS),
                  RQ1_PSIS )
```
```{r}
# user-defined function: plot of Deviance, WAIC, PSIS, and dWAIC with confidence intervals
par(mfrow=c(1,2))
plot_compare( compare_obj=RQ1_WAIC,
              ns=1, m='WAIC', dm=T )
plot_compare( compare_obj=RQ1_PSIS, 
              ns=1, m='PSIS', dm=T )
par(mfrow=c(1,1))
```
Figure 35: WAIC and PSIS model comparison plot. Note: Black and blue points describe point estimates, and continuous horizontal lines indicate the associated uncertainty.
```{r}
RQ1_WAIC$Model = as.integer( str_sub( rownames(RQ1_WAIC), start=-2) )

kable( round( RQ1_WAIC[,c(8,1:7)], 3), row.names=F, align='crrrrrrr', 
       escape=F, digits=2 )
```
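Table 3: WAIC comparison for selected models

| Model | DIC | WAIC | SE | dWAIC | dSE | pWAIC | weight |
|:-----:|---------:|---------:|-------:|--------:|-------:|-------:|-------:|
| 10 | -9741.66 | -9630.63 | 276.64 | 0.00 | NA | 55.52 | 1 |
| 7 | -9649.54 | -9586.00 | 274.50 | 44.63 | 17.89 | 31.77 | 0 |
| 4 | -2670.62 | -2024.84 | 127.02 | 7605.78 | 263.22 | 322.89 | 0 |
| 1 | -2278.68 | -1761.10 | 101.80 | 7869.53 | 266.54 | 258.79 | 0 |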
```{r}
RQ1_PSIS$Model = as.integer( str_sub( rownames(RQ1_PSIS), start=-2) )

kable( round( RQ1_PSIS[,c(8,1:7)], 3), row.names=F, align='crrrrrrr', 
       escape=F, digits=2 )
```
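Table 4: PSIS comparison for selected models

| Model | DIC | PSIS | SE | dPSIS | dSE | pPSIS | weight |
|:-----:|---------:|---------:|-------:|--------:|-------:|-------:|-------:|
| 10 | -9741.66 | -9629.27 | 276.74 | 0.00 | NA | 56.19 | 1 |
| 7 | -9649.54 | -9585.92 | 274.56 | 43.36 | 17.67 | 31.81 | 0 |
| 4 | -2670.62 | -2007.66 | 128.57 | 7621.61 | 263.60 | 331.48 | 0 |
| 1 | -2278.68 | -1753.71 | 102.09 | 7875.57 | 266.54 | 262.48 | 0 |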
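The claim that the uncertainty around dWAIC excludes zero can be checked directly from the tabulated values. The sketch below takes the dWAIC and dSE columns of Table 3 and builds an approximate 95% interval for each difference. The normal approximation and the 1.96 multiplier are assumptions made for illustration; the interval drawn by the user-defined `plot_compare()` function may use a different width.

```r
# dWAIC and dSE of Models 7, 4, and 1 relative to Model 10 (values from Table 3)
rq1 <- data.frame(model = c(7, 4, 1),
                  dWAIC = c(44.63, 7605.78, 7869.53),
                  dSE   = c(17.89, 263.22, 266.54))

# assumption: normal approximation with a 95% interval
z <- qnorm(0.975)
rq1$lower <- rq1$dWAIC - z * rq1$dSE
rq1$upper <- rq1$dWAIC + z * rq1$dSE

# a difference is distinguishable from zero when its interval excludes zero
rq1$excludes_zero <- rq1$lower > 0
print(rq1)
```

Under these assumptions all three intervals lie above zero, in line with the pattern described for Figure 35.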

Upon closer examination, the reasons behind the observed disparities between the models become apparent. Specifically, Figure 36 shows that the Normal LMM, as outlined in Model 4, fails to capture the underlying data patterns, producing predictions that are physically inconsistent because they fall outside the outcome’s range between zero and one. Figure 37 and Figure 39 provide further insight into this issue. Figure 37 displays Model 4’s score prediction densities, which bear no resemblance to the actual data densities. Furthermore, the top two panels of Figure 39 reveal that the misspecification of the Normal LMM causes the model to be more surprised by ‘extreme’ entropy scores, which it identifies as highly unlikely and influential observations. Consequently, the model is unreliable because its parameter estimates are potentially biased. In contrast, the Beta-proportion GLLAMM appears to capture the data patterns effectively, generating predictions within the expected data range. This is evident in Figure 36 and complemented by Figure 38 and Figure 39. In Figure 38, Model 10 displays prediction densities that more closely resemble the actual data densities. Furthermore, the bottom two panels of Figure 39 show that the model is less surprised by ‘extreme’ scores, fostering more trust in its estimates.
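The reason why a normal likelihood produces out-of-range predictions for bounded scores can be reproduced with a small simulation. The sketch below draws from a beta-proportion distribution, written as Beta(mu*M, (1-mu)*M) to match Stan's `beta_proportion( mu , M )`, and from a normal distribution with the same mean and standard deviation. The values of `mu` and `M` are illustrative assumptions, not estimates from the study.

```r
set.seed(12345)
n_draws <- 10000

# illustrative values (assumptions, not estimates from the study)
mu <- 0.1   # mean entropy score close to the lower bound
M  <- 10    # 'sample size' parameter

# beta-proportion draws: Beta(mu*M, (1-mu)*M)
y_beta <- rbeta(n_draws, shape1 = mu * M, shape2 = (1 - mu) * M)

# normal draws with the same mean and standard deviation
sd_y <- sqrt(mu * (1 - mu) / (M + 1))
y_norm <- rnorm(n_draws, mean = mu, sd = sd_y)

# share of simulated scores outside the [0, 1] range
out_of_range <- c(beta   = mean(y_beta < 0 | y_beta > 1),
                  normal = mean(y_norm < 0 | y_norm > 1))
print(round(out_of_range, 3))
```

The beta-proportion draws always respect the bounds, while a sizeable share of the normal draws is negative, which mirrors the physically inconsistent predictions discussed for Model 4.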

```{r}
# user-defined function: plot entropy scores and predictions for selected models
plot_speaker(d=data_H,
             stanfit_obj1=model04,
             stanfit_obj2=model10,
             p=0.95,
             decreasing=F,
             leg=c('model 04','model 10'))
```
Figure 36: Entropy scores prediction for selected models. Note: Black dots show manifest entropy scores, orange dots and vertical lines show the point estimates and 95% highest probability density interval (HPDI) derived from Model 4, blue dots and vertical lines show similar information for Model 10.
```{r}
# user-defined function: entropy and predicted scores density plot for selected model
col_string = rep( rethink_palette[2], 2)
pred_speaker_pairs(speakers=c(20,8,11, 25,30,6),
                   d=data_H, stanfit_obj=model04,
                   p=0.95, nbins=20, col_string=col_string)
```
Figure 37: Model 4: Entropy scores density for selected speakers. Note: Black bars denote the true data density, orange bars describe the predicted data density
```{r}
# user-defined function: entropy and predicted scores density plot for selected model
col_string = rep( rethink_palette[1], 2)
pred_speaker_pairs(speakers=c(20,8,11, 25,30,6),
                   d=data_H, stanfit_obj=model10,
                   p=0.95, nbins=20, col_string=col_string)
```
Figure 38: Model 10: Entropy scores density for selected speakers. Note: Black bars denote the true data density, blue bars describe the predicted data density
```{r}
# user-defined function: outliers identification for selected model
par(mfrow=c(2,2))
plot_outlier(d=data_H, stanfit_obj=model01)
plot_outlier(d=data_H, stanfit_obj=model04)
plot_outlier(d=data_H, stanfit_obj=model07)
plot_outlier(d=data_H, stanfit_obj=model10)
par(mfrow=c(1,1))
```
Figure 39: Outlier identification and analysis for selected models. Note: Thin and thick discontinuous vertical lines indicate thresholds of 0.5 and 0.7, respectively. Number pairs indicate the speaker and sentence indices of each observation.

8.2 Estimation of speakers’ latent potential intelligibility from manifest entropy scores (RQ2)

The second research question aimed to demonstrate the application of the Beta-proportion GLLAMM in estimating the latent potential intelligibility of speakers. This was achieved by employing the general mathematical formalism outlined in Equation 9, along with additional specifications provided in Table 2. The Bayesian procedure successfully estimated the latent potential intelligibility of speakers under Model 10 through the structural equation:

\begin{align} SI_{i} = \alpha + e_{i} + u_{i} \tag{28} \end{align}

Moreover, due to its implementation under Bayesian procedures, Model 10 provides the complete posterior distribution of the speakers’ potential intelligibility scores. This provision, in turn, (1) enables the calculation of summaries, facilitating the ranking of individuals, and (2) supports the assessment of differences among selected speakers. In both cases, the model considers the inherent uncertainty of the estimates resulting from its measurement using multiple entropy scores.

Table 5 and Figure 40 display the ranking of speakers in decreasing order of the point estimates of their latent potential intelligibility. These estimates are accompanied by their 95\% highest probability density intervals (HPDI). Both the table and the figure indicate that speaker 6 stands out as the least intelligible in the sample, followed at a distance by speakers 1, 17, and 9. In contrast, speaker 20 is the most intelligible, closely followed by speakers 23, 31, and 3. In turn, Table 6 and Figure 41 show summaries and the full posterior distributions for the comparison of potential intelligibility among selected speakers. Among these comparisons, only the differences between speaker 6 and speakers 1, 17, and 9, along with the difference between speakers 20 and 3, are statistically significant, as their 95\% HPDI (shaded area) do not include zero.
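The logic behind these contrasts can be sketched with simulated draws. The example below computes the difference between two speakers draw by draw and summarizes it with a highest probability density interval. The function `hpdi()` is a hypothetical base-R stand-in for the interval reported in the tables, and the simulated draws are assumptions that only loosely resemble the estimates above.

```r
# shortest interval that contains a share `prob` of the draws
hpdi <- function(x, prob = 0.95) {
  x <- sort(x)
  n <- length(x)
  k <- ceiling(prob * n)                  # draws the interval must contain
  widths <- x[k:n] - x[1:(n - k + 1)]     # width of every candidate interval
  i <- which.min(widths)
  c(lower = x[i], upper = x[i + k - 1])
}

set.seed(12345)
n_draws <- 8000

# simulated posterior draws of SI for two speakers (assumptions)
si_a <- rnorm(n_draws, mean = -2.1, sd = 0.24)
si_b <- rnorm(n_draws, mean = -0.9, sd = 0.22)

# the contrast is computed per draw, so it inherits the joint uncertainty
contrast <- si_a - si_b
interval <- hpdi(contrast, prob = 0.95)

print(round(c(mean = mean(contrast), interval), 3))
```

With real posterior samples, the draws of two speakers share the same intercept and are therefore correlated, which is why the difference has to be taken draw by draw rather than from the marginal summaries.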

```{r}
# user-defined function: retrieves SI scores for selected models
SI = pred_SI(d=data_H, stanfit_obj=model10, p=0.95)
SI = SI[order(SI$mean, decreasing=T), ]
SI = round(SI, 3)
SI$HPDI = paste0('[', SI$HPDI_lower, ', ', SI$HPDI_upper,']')
```
```{r}
kable( SI[,c(1:5,11)], align='rrrrrr', row.names=F, digits=3,
       col.names=c('Speaker ID','Hearing status',
                   'Chron. Age','Chron. Age (centered)',
                   'Posterior mean','95% HPDI') )
```
Table 5: Latent potential intelligibility and additional covariates

| Speaker ID | Hearing status | Chron. Age | Chron. Age (centered) | Posterior mean | 95% HPDI |
|---:|---:|---:|---:|-------:|-----------------:|
| 20 | 1 | 97 | 29 | 1.692 | [1.079, 2.294] |
| 23 | 1 | 94 | 26 | 1.185 | [0.595, 1.758] |
| 31 | 2 | 104 | 36 | 1.016 | [0.464, 1.576] |
| 3 | 1 | 93 | 25 | 0.958 | [0.42, 1.489] |
| 22 | 1 | 98 | 30 | 0.920 | [0.377, 1.454] |
| 25 | 2 | 93 | 25 | 0.919 | [0.378, 1.497] |
| 16 | 1 | 78 | 10 | 0.900 | [0.341, 1.498] |
| 21 | 1 | 88 | 20 | 0.869 | [0.322, 1.44] |
| 27 | 1 | 93 | 25 | 0.776 | [0.225, 1.303] |
| 5 | 1 | 82 | 14 | 0.539 | [0.003, 1.031] |
| 24 | 1 | 82 | 14 | 0.518 | [0.004, 1.018] |
| 14 | 2 | 85 | 17 | 0.397 | [-0.14, 0.894] |
| 18 | 1 | 86 | 18 | 0.256 | [-0.229, 0.768] |
| 15 | 2 | 85 | 17 | 0.225 | [-0.283, 0.74] |
| 26 | 2 | 85 | 17 | 0.201 | [-0.302, 0.693] |
| 10 | 1 | 82 | 14 | 0.047 | [-0.424, 0.529] |
| 13 | 2 | 85 | 17 | -0.012 | [-0.496, 0.47] |
| 4 | 1 | 80 | 12 | -0.051 | [-0.535, 0.432] |
| 28 | 2 | 93 | 25 | -0.060 | [-0.552, 0.42] |
| 32 | 1 | 78 | 10 | -0.066 | [-0.551, 0.397] |
| 7 | 2 | 104 | 36 | -0.069 | [-0.576, 0.416] |
| 30 | 2 | 84 | 16 | -0.131 | [-0.6, 0.367] |
| 12 | 1 | 82 | 14 | -0.264 | [-0.712, 0.189] |
| 2 | 2 | 68 | 0 | -0.265 | [-0.744, 0.216] |
| 8 | 1 | 86 | 18 | -0.341 | [-0.834, 0.114] |
| 19 | 2 | 86 | 18 | -0.398 | [-0.862, 0.075] |
| 11 | 1 | 82 | 14 | -0.550 | [-1.009, -0.12] |
| 29 | 2 | 80 | 12 | -0.558 | [-1.032, -0.129] |
| 9 | 2 | 83 | 15 | -0.912 | [-1.348, -0.489] |
| 17 | 2 | 76 | 8 | -0.916 | [-1.353, -0.494] |
| 1 | 2 | 85 | 17 | -0.920 | [-1.352, -0.477] |
| 6 | 2 | 85 | 17 | -2.122 | [-2.601, -1.652] |
```{r}
# user-defined function: plot ordered potential intelligibility score for speakers
plot_SI(d=data_H, stanfit_obj=model10,
        p=0.95, decreasing=T)
```
Figure 40: Model 10, latent potential intelligibility of speakers. Note: Black dots and vertical lines show mean point estimates and 95% HPDI intervals.
```{r}
# user-defined function: produce potential intelligibility contrast among selected speakers
SI = contrast_SI(d=data_H, stanfit_obj=model10,
                 speakers=c(6,20), p=0.95, raw=T)
SI_contr = SI$SI_contr
SI_contr = round(SI_contr, 3)
SI_contr$names = rownames(SI_contr)
SI_contr$HPDI = paste0('[', SI_contr$HPDI_lower, ', ', SI_contr$HPDI_upper,']')
```
```{r}
idx_comp = c(1,21,13,52,60,6)
kable( SI_contr[idx_comp,c('names','mean','HPDI')], row.names=F, align='ccc',
       col.names=c('Contrasts','Posterior mean','95% HPDI') )
```
Table 6: Latent potential intelligibility contrasts of selected speakers

| Contrasts | Posterior mean | 95% HPDI |
|:-------------------:|:------:|:----------------:|
| speaker06-speaker01 | -1.202 | [-1.626, -0.724] |
| speaker17-speaker06 | 1.206 | [0.762, 1.656] |
| speaker09-speaker06 | 1.211 | [0.76, 1.634] |
| speaker23-speaker20 | -0.507 | [-1.186, 0.242] |
| speaker31-speaker20 | -0.676 | [-1.351, 0.013] |
| speaker20-speaker03 | 0.734 | [0.067, 1.371] |
```{r}
require(rethinking)

par(mfrow=c(2,3))

# density plot for the differences in potential intelligibility between selected speakers
for(i in idx_comp){
  dens( SI$SI_raw[[i]], xlim=c(-2.5,2.5),
        col=rgb(0,0,0,0.7), show.HPDI=0.95,
        xlab='Difference in potential intelligibility')
  abline( v=0, lty=2, col=rgb(0,0,0,0.3))
  mtext( text=names(SI$SI_raw)[i], side=3, adj=0, cex=1.1)
}

par(mfrow=c(1,1))
```
Figure 41: Model 10, potential intelligibility comparisons among selected speakers. Note: Shaded area describes the 95% highest probability density interval (HPDI)

8.3 Testing the influence of speaker-related factors on intelligibility (RQ3)

This research question illustrates how theories on intelligibility can be examined within the model’s framework. Specifically, the focus centers on assessing the influence of speaker-related factors on intelligibility, such as chronological age and hearing status. Notably, despite RQ1 indicating the suitability of Beta-proportion GLLAMM models for entropy scores, existing statistical literature suggests that, in certain scenarios, models incorporating covariate adjustment exhibit robustness to misspecification in the functional form linking an outcome and covariates, commonly referred to as covariate-outcome relationship (Tackney et al., 2023). Consequently, this study compares all models detailed in Table 2. These models are characterized by different covariate adjustments on entropy scores or the latent potential intelligibility of speakers, namely chronological age and hearing status, while potentially exhibiting misspecification in the covariate-outcome relationship, as observed in the case of the Normal LMM.

Similar to RQ1, all criteria consistently identify the Beta-proportion GLLAMMs outlined in Models 11, 12, and 10 as the most plausible models for the data. These models exhibit the lowest values for both WAIC and PSIS, making them the least deviating models among those compared. Moreover, Figure 42 depicts the uncertainty of the models’ dWAIC and dPSIS values with horizontal blue lines. For most models, this uncertainty does not overlap with zero, which reveals that their predictive capabilities differ significantly from those of Model 11. Models 12 and 10, however, are exceptions to this pattern. This suggests that Models 11, 12, and 10 display the least deviation from perfect predictive accuracy among all models. Lastly, the weight of evidence in Table 7 and Table 8 shows that Model 11 accumulated the greatest support, followed by Model 12 and, lastly, Model 10.

```{r}
# seed for replication
set.seed(12345)

# comparison of selected models with WAIC
RQ3_WAIC = compare( func=WAIC,
                    model01, model02, model03,
                    model04, model05, model06,
                    model07, model08, model09,
                    model10, model11, model12 )

# DIC calculation
RQ3_WAIC = cbind( DIC=with(RQ3_WAIC, WAIC-2*pWAIC),
                  RQ3_WAIC )
```
```{r}
# seed for replication
set.seed(12345)

# comparison of selected models with PSIS
RQ3_PSIS = compare( func=PSIS,
                    model01, model02, model03,
                    model04, model05, model06,
                    model07, model08, model09,
                    model10, model11, model12 )

# DIC calculation
RQ3_PSIS = cbind( DIC=with(RQ3_PSIS, PSIS-2*pPSIS),
                  RQ3_PSIS )
```
```{r}
# user-defined function: plot of Deviance, WAIC, PSIS, and dWAIC with confidence intervals
par(mfrow=c(1,2))
plot_compare( compare_obj=RQ3_WAIC,
              ns=1, m='WAIC', dm=T ) 
plot_compare( compare_obj=RQ3_PSIS, 
              ns=1, m='PSIS', dm=T )
par(mfrow=c(1,1))
```
Figure 42: WAIC and PSIS model comparison plot. Note: Black and blue points describe point estimates, and continuous horizontal lines indicate the associated uncertainty.
```{r}
RQ3_WAIC$Model = as.integer( str_sub( rownames(RQ3_WAIC), start=-2) )

kable( round( RQ3_WAIC[,c(8,1:7)], 3), row.names=F, align='crrrrrrr', 
       escape=F, digits=2 )
```
Table 7: WAIC comparison for all models
Model DIC WAIC SE dWAIC dSE pWAIC weight
11 -9741.51 -9632.24 276.80 0.00 NA 54.63 0.46
12 -9741.49 -9631.66 276.82 0.58 1.00 54.91 0.34
10 -9741.66 -9630.63 276.64 1.61 2.97 55.52 0.20
9 -9649.15 -9586.67 274.35 45.56 18.01 31.24 0.00
8 -9649.05 -9586.41 274.33 45.83 18.01 31.32 0.00
7 -9649.54 -9586.00 274.50 46.24 18.19 31.77 0.00
6 -2669.28 -2027.11 126.86 7605.13 263.15 321.08 0.00
4 -2670.62 -2024.84 127.02 7607.40 263.22 322.89 0.00
5 -2669.28 -2024.58 127.06 7607.66 263.24 322.35 0.00
3 -2279.58 -1762.08 101.79 7870.16 266.68 258.75 0.00
1 -2278.68 -1761.10 101.80 7871.14 266.64 258.79 0.00
2 -2279.35 -1760.36 101.86 7871.88 266.69 259.49 0.00
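To make the reading of the dWAIC uncertainty in Figure 42 more concrete, the following sketch builds approximate intervals for the WAIC differences from the dWAIC and dSE columns of Table 7. It assumes a normal approximation with a 95% multiplier (`z = 1.96`); this is an illustrative choice and not necessarily the interval drawn by `plot_compare()`.

```r
# dWAIC and dSE relative to Model 11, copied from Table 7
d = data.frame( Model = c(12, 10, 9, 8, 7),
                dWAIC = c(0.58, 1.61, 45.56, 45.83, 46.24),
                dSE   = c(1.00, 2.97, 18.01, 18.01, 18.19) )

z = 1.96                                   # assumed normal-approximation multiplier
d$lower = d$dWAIC - z*d$dSE                # lower bound of the difference
d$upper = d$dWAIC + z*d$dSE                # upper bound of the difference
d$includes_zero = d$lower <= 0 & d$upper >= 0

print( d )
```

Under this approximation only Models 12 and 10 have intervals that include zero, which is in line with their description as exceptions in the text.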
Code
RQ3_PSIS$Model = as.integer( str_sub( rownames(RQ3_PSIS), start=-2) )

kable( round( RQ3_PSIS[,c(8,1:7)], 3), row.names=F, align='crrrrrrr', 
       escape=F, digits=2 )
Table 8: PSIS comparison for all models
Model DIC PSIS SE dPSIS dSE pPSIS weight
11 -9741.51 -9631.16 276.88 0.00 NA 55.17 0.46
12 -9741.49 -9630.70 276.90 0.47 1.01 55.39 0.36
10 -9741.66 -9629.27 276.74 1.89 2.84 56.19 0.18
9 -9649.15 -9586.58 274.41 44.58 17.91 31.28 0.00
8 -9649.05 -9586.33 274.39 44.83 17.91 31.36 0.00
7 -9649.54 -9585.92 274.56 45.24 18.10 31.81 0.00
6 -2669.28 -2009.22 128.46 7621.94 263.52 330.03 0.00
4 -2670.62 -2007.66 128.57 7623.50 263.60 331.48 0.00
5 -2669.28 -2006.49 128.71 7624.67 263.62 331.39 0.00
3 -2279.58 -1754.43 102.07 7876.73 266.68 262.57 0.00
1 -2278.68 -1753.71 102.09 7877.46 266.64 262.48 0.00
2 -2279.35 -1752.86 102.13 7878.30 266.68 263.24 0.00
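The weight column of both tables can be reproduced approximately from the difference column. The sketch below assumes the usual Akaike-type weighting, w_i = exp(-0.5 d_i) / sum_j exp(-0.5 d_j), and uses the rounded dWAIC and dPSIS values from Table 7 and Table 8, so the results agree with the tables only up to rounding. The helper `akaike_weights()` is hypothetical and is not one of the walk-through's user-defined functions.

```r
# hypothetical helper: Akaike-type weights from information-criterion differences
akaike_weights = function(d){
  w = exp( -0.5*d )
  w / sum(w)
}

models = c(11, 12, 10, 9, 8, 7, 6, 4, 5, 3, 1, 2)

# differences copied from Table 7 and Table 8
dWAIC = c(0.00, 0.58, 1.61, 45.56, 45.83, 46.24,
          7605.13, 7607.40, 7607.66, 7870.16, 7871.14, 7871.88)
dPSIS = c(0.00, 0.47, 1.89, 44.58, 44.83, 45.24,
          7621.94, 7623.50, 7624.67, 7876.73, 7877.46, 7878.30)

w_WAIC = akaike_weights(dWAIC)
w_PSIS = akaike_weights(dPSIS)

print( data.frame( Model=models,
                   weight_WAIC=round(w_WAIC, 2),
                   weight_PSIS=round(w_PSIS, 2) ) )
```

Because the differences for Models 1 to 9 are large, their weights are numerically indistinguishable from zero and essentially all support is shared among Models 11, 12, and 10.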

A closer examination of two models within this comparison set reveals the reasons behind the largest observed disparities. The Normal LMM outlined in Model 6 continues to struggle to capture the underlying data patterns, producing predictions that fall outside the outcome’s admissible range. Additionally, the model persists in flagging highly unlikely and influential observations, which undermines its reliability. In contrast, the Beta-proportion GLLAMM described by Model 12 appears less susceptible to ‘extreme’ scores and captures the data patterns within the expected data range, thereby instilling greater confidence in the reliability of the model’s estimates. This contrast is visually depicted in Figure 43, Figure 44, Figure 45, and Figure 46.
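The boundedness argument can be illustrated with a small simulation. The sketch below draws predictions for an outcome restricted to [0, 1] from a normal distribution and from a beta distribution in its mean and 'sample size' parameterization (shape parameters mu M and (1 - mu) M). The values of `mu`, `sigma`, and `M` are hypothetical and chosen only to mimic a speaker with entropy scores close to zero; they are not estimates from Model 6 or Model 12.

```r
set.seed(12345)

n     = 10000    # number of simulated predictions
mu    = 0.05     # hypothetical mean entropy score, close to the lower bound
sigma = 0.10     # hypothetical standard deviation for the normal model
M     = 10       # hypothetical 'sample size' parameter for the beta model

# normal predictions ignore the bounds of the outcome
y_normal = rnorm( n, mean=mu, sd=sigma )

# beta-proportion predictions respect the bounds by construction
y_beta = rbeta( n, shape1=mu*M, shape2=(1-mu)*M )

# proportion of predictions outside the admissible range
prop_out = c( normal = mean( y_normal < 0 | y_normal > 1 ),
              beta   = mean( y_beta < 0 | y_beta > 1 ) )
print( prop_out )
```

In this toy setting a sizeable share of the normal predictions is negative, whereas every beta-distributed prediction stays within the unit interval.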

Code
plot_speaker(d=data_H,
             stanfit_obj1=model06,
             stanfit_obj2=model12,
             p=0.95,
             decreasing=F,
             leg=c('model 06','model 12'))
Figure 43: Entropy scores prediction for selected models. Note: Black dots show manifest entropy scores, orange dots and vertical lines show the point estimates and 95% highest probability density intervals (HPDI) derived from model 6, blue dots and vertical lines show similar information for model 12.
  1. user defined function: plot entropy scores and two selected models
Code
col_string = rep( rethink_palette[2], 2)
pred_speaker_pairs(speakers=c(20,8,11, 25,30,6),
                   d=data_H, stanfit_obj=model06,
                   p=0.95, nbins=20, col_string=col_string)
1
user defined function: entropy and predicted scores density plot for selected model
Figure 44: Model 6: Entropy scores density for selected speakers. Note: Black bars denote the true data density, orange bars describe the predicted data density
Code
col_string = rep( rethink_palette[1], 2)
pred_speaker_pairs(speakers=c(20,8,11, 25,30,6),
                   d=data_H, stanfit_obj=model12,
                   p=0.95, nbins=20, col_string=col_string)
1
user defined function: entropy and predicted scores density plot for selected model
Figure 45: Model 12: Entropy scores density for selected speakers. Note: Black bars denote the true data density, blue bars describe the predicted data density
Code
par(mfrow=c(2,2))
plot_outlier(d=data_H, stanfit_obj=model05)
plot_outlier(d=data_H, stanfit_obj=model06)
plot_outlier(d=data_H, stanfit_obj=model11)
plot_outlier(d=data_H, stanfit_obj=model12)
par(mfrow=c(1,1))
1
user defined function: outliers identification for selected model
Figure 46: Outlier identification and analysis for selected models. Note: Thin and thick discontinuous vertical lines indicate thresholds of 0.5 and 0.7, respectively. Number pairs indicate the speaker and sentence index of each observation.

Considering the results in Table 7, Table 8, and Figure 42, the model comparisons favor three distinct models: Models 10, 11, and 12. Model 10, supported by about 20% of the evidence (WAIC weights, Table 7), estimates a single intercept \alpha and no slope to explain the potential intelligibility of speakers (Table 9). In contrast, supported by about 46% of the evidence, Model 11 estimates distinct intercepts for each hearing status group, namely \alpha_{HS[1]} for NH speakers and \alpha_{HS[2]} for the HI/CI counterparts, while maintaining a single slope that gauges the impact of age on potential intelligibility estimates. The 95% HPDI for the comparison of intercepts \alpha_{HS[2]}-\alpha_{HS[1]} reveals significant differences between NH and HI/CI speakers (Table 10). Lastly, with about 34% of the evidence, Model 12 estimates one intercept and slope per hearing status group, namely \alpha_{HS[1]} and \beta_{A,HS[1]} for the NH speakers, and \alpha_{HS[2]} and \beta_{A,HS[2]} for the HI/CI counterparts. The 95% HPDIs for the comparison of intercepts and slopes reveal significant differences solely in the slopes between NH and their HI/CI counterparts (\beta_{A,HS[2]}-\beta_{A,HS[1]}, see Table 11).

However, a discerning reader can notice that these models yield conflicting conclusions regarding the influence of chronological age and hearing status on intelligibility. Model 10 implies no influence of chronological age and hearing status on the potential intelligibility of speakers. A visual inspection of Figure 47, however, reveals the reason for the model’s low support. Model 10 fails to capture the prevalent increasing age pattern observed in potential intelligibility estimates. In contrast, Model 11 identifies significant differences in potential intelligibility between NH and HI/CI speakers. The model further suggests that with the progression of chronological age, HI/CI speakers lag behind in intelligibility development, with no opportunity to catch up to their NH counterparts within the analyzed age range, as depicted in Figure 48. Finally, Model 12 indicates no significant differences in intelligibility between NH and HI/CI speakers at 68 months of age (around 6 years old). However, the model reveals distinct evolution patterns of intelligibility per unit of chronological age between different hearing status groups, with HI/CI speakers displaying a slower rate of development compared to their NH counterparts within the analyzed age range. The latter is evident in Figure 49.
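The contrast between the conclusions of Models 11 and 12 can be seen directly from the posterior means in Table 10 and Table 11. The sketch below computes the HI/CI minus NH gap on the latent (linear predictor) scale over a grid of centered ages, assuming an additive predictor of the form group intercept plus group slope times centered age. The grid `A_c` and its units are hypothetical, and the calculation uses rounded posterior means only, so it ignores posterior uncertainty.

```r
# hypothetical grid of centered ages (0 = age at which the intercepts apply)
A_c = seq( 0, 20, by=5 )

# Model 11 (Table 10): separate intercepts, common slope -> constant gap
gap_m11 = rep( -0.03 - 0.53, length(A_c) )

# Model 12 (Table 11): separate intercepts and slopes -> gap changes with age
gap_m12 = (0.23 - 0.21) + (0.06 - 0.10)*A_c

gaps = data.frame( A_c, gap_m11, gap_m12 )
print( gaps )
```

Under these assumptions Model 11 implies the same gap at every age, whereas Model 12 implies a negligible gap at the reference age that widens as age increases. The Model 11 gap computed from rounded means (-0.56) differs slightly from the contrast reported in Table 10 (-0.55) because of rounding.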

Code
par_int = c('a','aHS[1]','aHS[2]','bAm','bAmHS[1]','bAmHS[2]')
model_par = par_recovery(stanfit_obj=model10,
                         p=0.95, est_par=par_int)

vars_int = c('mean','HPDI_lower','HPDI_upper')
model_tab = model_par[,vars_int]
model_tab = round(model_tab, 2)
model_tab$names = c('$\\alpha$')
model_tab$HPDI = paste0('[', model_tab$HPDI_lower, ', ', model_tab$HPDI_upper,']')
1
parameters of interest
2
user defined function: recovers the hearing status and chronological age parameter estimates for selected model
Code
vars_tab = c('names','mean','HPDI')
opts = options(knitr.kable.NA="")
kable( model_tab[,vars_tab], row.names=F, align='ccc', 
       escape=F, digits=3,
       col.names=c('Parameter','Posterior mean','95% HPDI') )
Table 9: Model 10, parameter estimates and 95% highest probability density intervals (HPDI)
Parameter Posterior mean 95% HPDI
\alpha 0.01 [-0.09, 0.1]
Code
pred_intel(d=data_H, stanfit_obj=model10,
           p=0.95, ns=500, seed=12345)
1
user defined function: plot potential intelligibility per age and hearing status for selected model
Figure 47: Model 10, potential intelligibility per chronological age and hearing status. Note: Colored dots denote mean point estimates, vertical lines describe the 95% highest probability density intervals (HPDI), the thick discontinuous line indicates the regression line, thin continuous lines denote regression lines sampled from the posterior distribution, and numbers indicate the speaker index.
Code
par_int = c('a','aHS[1]','aHS[2]','bAm','bAmHS[1]','bAmHS[2]')
model_par = par_recovery(stanfit_obj=model11,
                         p=0.95, est_par=par_int)
contrs = contrast_intel(stanfit_obj=model11, p=0.95,
                        rope=c(-0.05,0.05))

vars_int = c('mean','HPDI_lower','HPDI_upper')
model_tab = rbind(model_par[,vars_int], rep(NA,3), rep(NA,3), contrs[,vars_int])
model_tab = round(model_tab, 2)
model_tab$names = c('$\\alpha$','$\\alpha_{HS[1]}$','$\\alpha_{HS[2]}$',
                    '$\\beta_{A}$', NA,
                    'Contrasts','$\\alpha_{HS[2]} - \\alpha_{HS[1]}$')
model_tab$HPDI = paste0('[', model_tab$HPDI_lower, ', ', model_tab$HPDI_upper,']')
model_tab[5, 1:5] = NA
model_tab[6, c(1:3,5)] = NA
1
parameters of interest
2
user defined function: recovers the hearing status and chronological age parameter estimates for selected model
3
user defined function: extracts contrast of interest from selected model
Code
vars_tab = c('names','mean','HPDI')
opts = options(knitr.kable.NA = "")
kable( model_tab[,vars_tab], row.names=F, align='ccc', 
       escape=F, digits=3,
       col.names=c('Parameter','Posterior mean','95% HPDI') )
Table 10: Model 11, parameter estimates and 95% highest probability density intervals (HPDI)
Parameter Posterior mean 95% HPDI
\alpha 0.01 [-0.08, 0.11]
\alpha_{HS[1]} 0.53 [0.11, 0.94]
\alpha_{HS[2]} -0.03 [-0.43, 0.39]
\beta_{A} 0.07 [0.05, 0.1]
Contrasts
\alpha_{HS[2]} - \alpha_{HS[1]} -0.55 [-1, -0.15]
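For readers less familiar with the interval reported in the last column, the sketch below shows one way to obtain a highest probability density interval from posterior draws: sort the draws and keep the narrowest window that contains the requested probability mass. The draws are simulated from a normal distribution with a hypothetical mean and standard deviation that loosely resemble the contrast in Table 10. The helper `hpdi_sketch()` and the region-of-practical-equivalence summary are illustrative and are not the implementation inside `contrast_intel()`.

```r
set.seed(12345)

# hypothetical helper: narrowest interval containing a proportion p of the draws
hpdi_sketch = function(x, p=0.95){
  x = sort(x)
  n = length(x)
  k = ceiling( p*n )                        # draws inside the interval
  width = x[k:n] - x[1:(n-k+1)]             # width of every candidate window
  i = which.min(width)                      # start of the narrowest window
  c( lower=x[i], upper=x[i+k-1] )
}

# simulated stand-in for posterior draws of aHS[2] - aHS[1]
draws = rnorm( 20000, mean=-0.55, sd=0.22 )

hp = hpdi_sketch( draws, p=0.95 )
rope = c(-0.05, 0.05)                       # same limits as in the chunk above
in_rope = mean( draws > rope[1] & draws < rope[2] )

print( round( c(hp, in_rope=in_rope), 3 ) )
```

An interval that excludes zero, together with a small share of draws inside the ROPE, is the pattern that the text describes as a significant difference between groups.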
Code
pred_intel(d=data_H, stanfit_obj=model11,
           p=0.95, ns=500, seed=12345)
1
user defined function: plot potential intelligibility per age and hearing status for selected model
Figure 48: Model 11, potential intelligibility per chronological age and hearing status. Note: Colored dots denote mean point estimates, vertical lines describe the 95% highest probability density intervals (HPDI), the thick discontinuous line indicates the regression line, thin continuous lines denote regression lines sampled from the posterior distribution, and numbers indicate the speaker index.
Code
par_int = c('a','aHS[1]','aHS[2]','bAm','bAmHS[1]','bAmHS[2]')
model_par = par_recovery(stanfit_obj=model12,
                         p=0.95, est_par=par_int)
contrs = contrast_intel(stanfit_obj=model12, p=0.95,
                        rope=c(-0.05,0.05))

vars_int = c('mean','HPDI_lower','HPDI_upper')
model_tab = rbind(model_par[,vars_int], rep(NA,3), rep(NA,3), contrs[,vars_int])
model_tab = round(model_tab, 2)
model_tab$names = c('$\\alpha$','$\\alpha_{HS[1]}$','$\\alpha_{HS[2]}$',
                    '$\\beta_{A,HS[1]}$','$\\beta_{A,HS[2]}$', NA,
                    'Contrasts','$\\alpha_{HS[2]} - \\alpha_{HS[1]}$',
                    '$\\beta_{A,HS[2]} - \\beta_{A,HS[1]}$')
model_tab$HPDI = paste0('[', model_tab$HPDI_lower, ', ', model_tab$HPDI_upper,']')
model_tab[6, 1:5] = NA
model_tab[7, c(1:3,5)] = NA
1
parameters of interest
2
user defined function: recovers the hearing status and chronological age parameter estimates for selected model
3
user defined function: extracts contrast of interest from selected model
Code
vars_tab = c('names','mean','HPDI')
opts = options(knitr.kable.NA = "")
kable( model_tab[,vars_tab], row.names=F, align='ccc', 
       escape=F, digits=3,
       col.names=c('Parameter','Posterior mean','95% HPDI') )
Table 11: Model 12, parameter estimates and 95% highest probability density intervals (HPDI)
Parameter Posterior mean 95% HPDI
\alpha 0.01 [-0.09, 0.11]
\alpha_{HS[1]} 0.21 [-0.28, 0.72]
\alpha_{HS[2]} 0.23 [-0.24, 0.69]
\beta_{A,HS[1]} 0.10 [0.07, 0.13]
\beta_{A,HS[2]} 0.06 [0.03, 0.09]
Contrasts
\alpha_{HS[2]} - \alpha_{HS[1]} 0.01 [-0.61, 0.74]
\beta_{A,HS[2]} - \beta_{A,HS[1]} -0.04 [-0.08, 0]
Code
pred_intel(d=data_H, stanfit_obj=model12,
           p=0.95, ns=500, seed=12345)
1
user defined function: plot potential intelligibility per age and hearing status for selected model
Figure 49: Model 12, potential intelligibility per chronological age and hearing status. Note: Colored dots denote mean point estimates, vertical lines describe the 95% highest probability density intervals (HPDI), the thick discontinuous line indicates the regression line, thin continuous lines denote regression lines sampled from the posterior distribution, and numbers indicate the speaker index.

8.4 Chain quality and information

Given the considerable number of fitted models and the resulting abundance of parameters, this section showcases the quality and information of the Bayesian chains only for Models 6 and 12. These models were selected because they have the highest parameter counts among those detailed in Section 7.3. Nevertheless, all fitted models were examined in the same manner, and all of them showed results comparable to those of the two models chosen for illustration.

In general, both graphical analysis and diagnostic statistics indicated that all chains exhibited low to moderate autocorrelation, explored the parameter space in a seemingly random manner, and converged to a constant mean and variance in their post-warm-up phase. Figure 50 visualizes the \widehat{\text{R}} diagnostic statistic and Figure 51 through Figure 63 illustrate the chains’ graphical analysis.

Code
par_int = c( 'a','aHS[1]','aHS[2]',
             'bAm','bAmHS[1]','bAmHS[2]', 
             'm_b','s_b',
             'm_i','s_i',
             'm_u','s_u',
             'r_s','r_M',
             's_w','Mw')

model06_parameters = par_recovery(
  stanfit_obj = model06,
  est_par = par_int,
  p = 0.95 )

model12_parameters = par_recovery(
  stanfit_obj = model12,
  est_par = par_int,
  p = 0.95 )
1
parameters of interest
2
user-defined function: displays concise parameter estimate information for selected model
3
user-defined function: displays concise parameter estimate information for selected model
Code
par( mfrow=c(1,2) )

plot( 1:nrow(model06_parameters), model06_parameters$Rhat4,
      ylim=c(0.95, 1.1), pch=19, col=rgb(0,0,0,alpha=0.3),
      xaxt='n',xlab='', ylab='Rhat',
      main='Normal LMM: model 06')
axis( side=1, at=1:nrow(model06_parameters),
      labels=rownames(model06_parameters),
      cex.axis=0.8, las=2 )
abline( h=1.05, lty=2, col=rgb(0,0,0,0.3) )

plot( 1:nrow(model12_parameters), model12_parameters$Rhat4,
      ylim=c(0.95, 1.1), pch=19, col=rgb(0,0,0,alpha=0.3),
      xaxt='n',xlab='', ylab='Rhat',
      main='Beta-proportion GLLAMM: model 12')
axis( side=1, at=1:nrow(model12_parameters),
      labels=rownames(model12_parameters),
      cex.axis=0.8, las=2 )
abline( h=1.05, lty=2, col=rgb(0,0,0,0.3) )

par( mfrow=c(1,1) )
1
model 06: Rhat values plot
2
model 06: parameters names in x-axis
3
model 06: convergence threshold
4
model 12: Rhat values plot
5
model 12: parameters names in x-axis
6
model 12: convergence threshold
Figure 50: Selected models, Rhat values
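As a reminder of what the statistic in Figure 50 measures, the sketch below computes the classic split version of \widehat{\text{R}}, which compares between-chain and within-chain variance after splitting each chain in half. This is a simplified stand-in, so its values may differ from the `Rhat4` column returned by `par_recovery()`. The chains are simulated, and `split_rhat()` is a hypothetical helper rather than one of the walk-through's functions.

```r
set.seed(12345)

# classic split-Rhat for one parameter; draws = iterations x chains matrix
split_rhat = function(draws){
  n_half = floor( nrow(draws)/2 )
  first  = draws[ 1:n_half, , drop=FALSE ]
  second = draws[ (nrow(draws)-n_half+1):nrow(draws), , drop=FALSE ]
  halves = cbind( first, second )               # each half treated as a chain
  W = mean( apply(halves, 2, var) )             # within-chain variance
  B = n_half * var( colMeans(halves) )          # between-chain variance
  var_plus = (n_half-1)/n_half * W + B/n_half   # pooled variance estimate
  sqrt( var_plus/W )
}

# four simulated chains that target the same distribution
good = matrix( rnorm(4*1000), nrow=1000, ncol=4 )

# same chains, but the fourth one is stuck in a different region
bad = good
bad[,4] = bad[,4] + 3

rhat_good = split_rhat(good)
rhat_bad  = split_rhat(bad)
print( round( c(good=rhat_good, bad=rhat_bad), 3 ) )
```

Chains that agree yield a value close to one, while the shifted chain pushes the statistic well above the 1.05 reference line drawn in Figure 50.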
Code
tri_plot( stan_object=model06,
          pars=c('aHS[1]','aHS[2]') )
1
user-defined function: generation of trace, trace rank, and ACF plots for selected parameters within a model
Figure 51: Model 06, trace, trace rank and ACF plots for selected parameters
Code
tri_plot( stan_object=model12,
          pars=c('a','aHS[1]','aHS[2]') )
1
user-defined function: generation of trace, trace rank, and ACF plots for selected parameters within a model
Figure 52: Model 12, trace, trace rank and ACF plots for selected parameters
Code
tri_plot( stan_object=model06,
          pars=c('bAmHS[1]','bAmHS[2]') )
1
user-defined function: generation of trace, trace rank, and ACF plots for selected parameters within a model
Figure 53: Model 06, trace, trace rank and ACF plots for selected parameters
Code
tri_plot( stan_object=model12,
          pars=c('bAmHS[1]','bAmHS[2]') )
1
user-defined function: generation of trace, trace rank, and ACF plots for selected parameters within a model
Figure 54: Model 12, trace, trace rank and ACF plots for selected parameters
Code
tri_plot( stan_object=model06,
          pars=c('m_b','s_b') )
1
user-defined function: generation of trace, trace rank, and ACF plots for selected parameters within a model
Figure 55: Model 06, trace, trace rank and ACF plots for selected parameters
Code
tri_plot( stan_object=model12,
          pars=c('m_b','s_b') )
1
user-defined function: generation of trace, trace rank, and ACF plots for selected parameters within a model
Figure 56: Model 12, trace, trace rank and ACF plots for selected parameters
Code
tri_plot( stan_object=model06,
          pars=c('m_i','s_i') )
1
user-defined function: generation of trace, trace rank, and ACF plots for selected parameters within a model
Figure 57: Model 06, trace, trace rank and ACF plots for selected parameters
Code
tri_plot( stan_object=model12,
          pars=c('m_i','s_i') )
1
user-defined function: generation of trace, trace rank, and ACF plots for selected parameters within a model
Figure 58: Model 12, trace, trace rank and ACF plots for selected parameters
Code
tri_plot( stan_object=model06,
          pars=c('m_u','s_u') )
1
user-defined function: generation of trace, trace rank, and ACF plots for selected parameters within a model
Figure 59: Model 06, trace, trace rank and ACF plots for selected parameters
Code
tri_plot( stan_object=model12,
          pars=c('m_u','s_u') )
1
user-defined function: generation of trace, trace rank, and ACF plots for selected parameters within a model
Figure 60: Model 12, trace, trace rank and ACF plots for selected parameters
Code
tri_plot( stan_object=model06,
          pars=c('r_s', paste0('s_w[', 1:4,']')) )
1
user-defined function: generation of trace, trace rank, and ACF plots for selected parameters within a model
Figure 61: Model 06, trace, trace rank and ACF plots for selected parameters
Code
tri_plot( stan_object=model12,
          pars=c('r_M', paste0('Mw[', 1:4,']')) )
1
user-defined function: generation of trace, trace rank, and ACF plots for selected parameters within a model
Figure 62: Model 12, trace, trace rank and ACF plots for selected parameters
Code
tri_plot( stan_object=model12,
          pars=paste0('SI[', 1:5, ']') )
1
user-defined function: generation of trace, trace rank, and ACF plots for selected parameters within a model
Figure 63: Model 12, trace, trace rank and ACF plots for selected parameters

Moreover, the density plots and n_{\text{eff}} statistics collectively confirmed that all posterior distributions are unimodal, centered around a mean, generated with a satisfactory number of uncorrelated sampling points, and substantively sensible when compared to the models’ prior beliefs. Figure 64 visualizes the n_{\text{eff}} diagnostic statistic and Figure 65 through Figure 71 illustrate the chains’ graphical analysis.

Code
par( mfrow=c(1,2) )

plot( 1:nrow(model06_parameters), model06_parameters$n_eff,
      ylim=c(0, 18000), pch=19, col=rgb(0,0,0,alpha=0.3),
      xaxt='n',xlab='', ylab='Neff',
      main='Normal LMM: model 06')
axis( side=1, at=1:nrow(model06_parameters),
      labels=rownames(model06_parameters),
      cex.axis=0.8, las=2 )
abline( h=seq(0, 18000, by=2000), lty=2, col=rgb(0,0,0,0.3) )

plot( 1:nrow(model12_parameters), model12_parameters$n_eff,
      ylim=c(0, 18000), pch=19, col=rgb(0,0,0,alpha=0.3),
      xaxt='n',xlab='', ylab='Neff',
      main='Beta-proportion GLLAMM: model 12')
axis( side=1, at=1:nrow(model12_parameters),
      labels=rownames(model12_parameters),
      cex.axis=0.8, las=2 )
abline( h=seq(0, 18000, by=2000), lty=2, col=rgb(0,0,0,0.3) )

par( mfrow=c(1,1) )
1
model 06: Neff values plot
2
model 06: parameters names in x-axis
3
model 06: horizontal reference lines every 2000 effective samples
4
model 12: Neff values plot
5
model 12: parameters names in x-axis
6
model 12: horizontal reference lines every 2000 effective samples
Figure 64: Selected models, neff values
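The link between autocorrelation and the effective number of samples in Figure 64 can be sketched as n_{\text{eff}} = N / (1 + 2 \sum_k \rho_k), where \rho_k is the lag-k autocorrelation of a chain. The example below uses a deliberately simple truncation rule (stop at the first non-positive autocorrelation) and simulated chains, so it is only an approximation to the estimator behind the `n_eff` column and not a reproduction of it. The helper `n_eff_sketch()` is hypothetical.

```r
set.seed(12345)

# hypothetical helper: effective sample size of a single chain
n_eff_sketch = function(x, lag_max=200){
  rho = as.numeric( acf(x, lag.max=lag_max, plot=FALSE)$acf )[-1]  # drop lag 0
  first_neg = which( rho <= 0 )[1]               # simple truncation rule
  if( !is.na(first_neg) ) rho = rho[ seq_len(first_neg-1) ]
  length(x) / ( 1 + 2*sum(rho) )
}

N = 5000
x_iid = rnorm(N)                                        # uncorrelated draws
x_ar  = as.numeric( arima.sim( model=list(ar=0.5), n=N ) )  # autocorrelated draws

n_eff_iid = n_eff_sketch(x_iid)
n_eff_ar  = n_eff_sketch(x_ar)
print( round( c(iid=n_eff_iid, ar1=n_eff_ar) ) )
```

For the autocorrelated chain the effective sample size falls to roughly a third of the nominal number of draws, which matches the theoretical factor (1 - 0.5)/(1 + 0.5) for a first-order autoregressive process with coefficient 0.5.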
Code
par_int = c('a','aHS','bAm','bAmHS')
dens_plot(stanfit_obj=model06, pars=par_int, p=0.95)
1
parameters of interest
2
user-defined function: generation of density plot with HPDI for selected parameters
Figure 65: Model 06, density plots for selected parameters
Code
par_int = c('a','aHS','bAm','bAmHS')
dens_plot(stanfit_obj=model12, pars=par_int, p=0.95)
1
parameters of interest
2
user-defined function: generation of density plot with HPDI for selected parameters
Figure 66: Model 12, density plots for selected parameters
Code
par_int = c('m_b','m_i','m_u','s_b','s_i','s_u')
dens_plot(stanfit_obj=model06, pars=par_int, p=0.95)
1
parameters of interest
2
user-defined function: generation of density plot with HPDI for selected parameters
Figure 67: Model 06, density plots for selected parameters
Code
par_int = c('m_b','m_i','m_u','s_b','s_i','s_u')
dens_plot(stanfit_obj=model12, pars=par_int, p=0.95)
1
parameters of interest
2
user-defined function: generation of density plot with HPDI for selected parameters
Figure 68: Model 12, density plots for selected parameters
Code
par_int = c('r_s','s_w')
dens_plot(stanfit_obj=model06, pars=par_int, p=0.95)
1
parameters of interest
2
user-defined function: generation of density plot with HPDI for selected parameters
Figure 69: Model 06, density plots for selected parameters
Code
par_int = c('r_M','Mw')
dens_plot(stanfit_obj=model12, pars=par_int, p=0.95)
1
parameters of interest
2
user-defined function: generation of density plot with HPDI for selected parameters
Figure 70: Model 12, density plots for selected parameters
Code
par_int = 'SI'
dens_plot(stanfit_obj=model12, pars=par_int, p=0.95)
1
parameters of interest
2
user-defined function: generation of density plot with HPDI for selected parameters
Figure 71: Model 12, density plots for selected parameters

9 Discussion

9.1 Findings

This study examined the suitability of the Bayesian Beta-proportion GLLAMM for the quantitative measuring and testing of research theories related to speech intelligibility using entropy scores. The initial findings supported the assertion that Beta-proportion GLLAMMs consistently outperformed Normal LMMs in predicting entropy scores, underscoring their superior predictive performance. The results emphasized that models neglecting the outcomes’ measurement error and boundedness lead to underfitting and misspecification issues, even when robust features are integrated. This is clearly illustrated by the Normal LMMs.

Secondly, the study showcased the Beta-proportion GLLAMM’s proficiency in estimating the latent potential intelligibility of speakers based on manifest entropy scores. Implemented under Bayesian procedures, the proposed model offered a valuable advantage over frequentist methods by further providing the full posterior distribution of the speakers’ potential intelligibility. This provision facilitated the calculation of summaries, aiding individual rankings, and supported the comparisons among selected speakers. In both scenarios, the proposed model accounted for the inherent uncertainty in the intelligibility estimates.

Thirdly, the study illustrated how the proposed model assessed the impact of speaker-related factors on potential intelligibility. The results suggested that multiple models were plausible for the observed entropy scores, indicating that different speaker-related factor theories were viable for the data, with some presenting contradictory conclusions about the influence of those factors on intelligibility. However, even when unequivocal support for one theory was not possible, the divided support among these models suggested that certain statistical issues may be hindering the models’ ability to distinguish among individuals and, ultimately, among theories. These issues encompassed the insufficient sample size of speakers, the inadequate representation of the population of speakers, and the imprecise measurement of the latent variable of interest.

Ultimately, this study introduced researchers to innovative statistical tools that enhanced existing research models. These tools not only assessed the predictability of empirical phenomena but also quantitatively measured the latent trait of interest, namely potential intelligibility, facilitating the comparison of research theories related to this trait. However, the presented tools introduce new challenges for researchers seeking their implementation. These challenges emerge from two distinct aspects: one methodological and the other practical. In the methodological domain, researchers need familiarity with Bayesian methods and the principled formulation of assumptions regarding the data-generating processes and research inquiries. This entails understanding and addressing each of the data and research challenges within the context of a statistical (probabilistic) model. In the practical domain, researchers need familiarity with probabilistic programming languages (PPLs), which are designed for specifying and obtaining inferences from probabilistic models, the core of Bayesian methods. To ensure the successful utilization of these statistical tools, this study addresses both challenges by providing comprehensive, step-by-step guidance in the form of this digital walk-through document.

9.2 Limitations and future research

This study provides valuable insights into the use of a novel approach to simultaneously address the different data features of entropy scores in speech intelligibility research. However, it is important to acknowledge the limitations of this study and explore potential avenues for future research.

Firstly, the study interprets potential intelligibility as an unobserved latent trait of speakers influencing the likelihood of observing a set of entropy scores. These scores, in turn, reflect the transcribers’ ability to decode words in sentences produced by the same speakers. Despite this practical approach, the construct validity of the latent trait heavily depends on the listeners’ appropriate understanding and execution of the transcription task. Construct validity, as defined by Cronbach and Meehl (1955), refers to the extent to which a set of manifest variables accurately represents a concept that cannot be directly measured. Because the study assumes that the transcription task set by Boonen et al. (2021) was properly understood and executed, it expects potential intelligibility to reflect the overall speech intelligibility of speakers. However, this study does not delve into the general epistemological considerations regarding the connection between the latent variable and the concept.

Secondly, the study identified a notable absence of unequivocal support for one of the compared models. This deficiency may be attributed to factors such as the insufficient sample size of speakers, the inadequate representation of the populations of speakers (referred to as selection bias), and the imprecise measurement of the latent variable. Insufficient sample size and selection bias yield data with limited outcome and covariates ranges, leading to biased and imprecise parameter estimates (Everitt & Skrondal, 2010). Furthermore, these issues, exacerbated by reduced measurement precision, can result in models with diminished statistical power and a higher risk of type I or type II errors (McElreath, 2020). Consequently, future research should consider conducting power analyses for the proposed models. This entails assessing the impact of expanding the speakers’ pool on testing research theories, or increasing the number of speech samples, transcriptions, and listeners to enhance the precision of potential intelligibility estimates. With these insights, future investigations should contemplate increasing the speaker sample with a group that adequately represents the population of interest. However, this must be done while mindful of the pragmatic limitations associated with transcription tasks, specifically considering the costs and time-intensiveness of the procedure.

Thirdly, the study presented an illustrative example for the investigation of research theories within the model’s framework. However, it did not offer an exhaustive evaluation of all factors influencing intelligibility, which are thoroughly explored in the works of Boons et al. (2012), Fagan et al. (2020), Gillis (2018), and Niparko et al. (2010). Consequently, the study cannot discard the presence of unobservable variables that might bias the parameter estimates, potentially impacting the inferences provided. Hence, future research should consider integrating appropriate causal hypotheses about these factors into the proposed models, as proper covariate adjustment facilitates the production of unbiased and precise parameter estimates (Cinelli, Forney, & Pearl, 2022; Deffner, Rohrer, & McElreath, 2022).

Lastly, this study proposes two directions for future exploration in speech intelligibility research. Firstly, there is an opportunity to investigate alternative methods for assessing speech intelligibility beyond transcription tasks and entropy scores. The experimental design of transcription tasks implies that the procedure may be time-intensive and costly. Thus, exploring less time-intensive or more cost-effective procedures that still offer comparable precision in intelligibility estimates could benefit researchers and speech therapists alike. An illustrative example of such a method is Comparative Judgment (CJ), where judges compare and score the perceived intensity of a trait between two stimuli (Thurstone, 1927). In the context of the intelligibility trait, the stimuli under assessment could be the speech samples uttered by two speakers. CJ serves as an ideal example because the method has gained increasing attention within the realm of educational assessment, with several studies providing evidence for its validity in assessing various tasks within student work, as demonstrated by examples in Pollit (2012a, 2012b), Lesterhuis (2018), van Daal (2020) and Verhavert et al. (2019).

A second avenue for exploration involves integrating diverse data types and evaluation methods to assess individuals’ intelligibility. This can be accomplished by leveraging two features of Bayesian methods: their flexibility and the concept of Bayesian updating. Bayesian methods possess the flexibility to simultaneously handle various data types. Additionally, through Bayesian updating, researchers can use the posterior distribution of parameters from one evaluation as the prior in models for subsequent evaluations. Ultimately, this could enable researchers to assess speakers’ intelligibility progress without committing to a specific data type or evaluation method. This advancement could mirror the emergence of the second-generation Structural Equation Models proposed by Muthén (2001), where models facilitate the combined estimation of categorical and continuous latent variables. However, in the context of future research, the proposal would facilitate the estimation of latent variables using a combination of data types and evaluation methods, contingent upon the fulfillment of construct validity by those evaluation methods.

10 Conclusions

This study highlights the effectiveness of the Bayesian Beta-proportion GLLAMM to collectively address several key data features when investigating unobservable and complex traits, using speech intelligibility and entropy scores as an example. The results demonstrate the proposed model consistently outperforms the Normal LMM in predicting the empirical phenomena. Moreover, it exhibits the ability to quantify the latent potential intelligibility of speakers, allowing for the ranking and comparison of individuals based on the latent trait while accommodating associated uncertainties. Additionally, the proposed model facilitates the exploration of research theories concerning the influence of speaker-related factors on potential intelligibility. The study indicates that integrating and comparing these theories within the model’s framework is a straightforward task. However, the introduction of these innovative statistical tools presents new challenges for researchers seeking implementation. These challenges encompass the principled formulation of assumptions about the data-generating processes and research inquiries, along with the need for familiarity with probabilistic programming languages (PPLs) essential for implementing Bayesian methods. Nevertheless, the study suggests several promising avenues for future research, including power analysis, causal hypothesis formulation, and exploration and integration of novel evaluation methods for assessing intelligibility. The insights derived from this study hold implications for both researchers and data analysts interested in quantitatively measuring and testing theories related to nuanced, unobservable constructs, while also considering the appropriate prediction of the empirical phenomena.

Declarations

Funding: The project was funded through the Research Fund of the University of Antwerp (BOF).

Conflict of interests: The authors declare no conflict of interest.

Ethics approval: This is an observational study. The University of Antwerp Research Ethics Committee has confirmed that no ethical approval is required.

Consent to participate: Not applicable

Consent for publication: All authors have read and agreed to the published version of the manuscript.

Availability of data and materials: The data are available upon request, while the user-defined functions are available in the code folder of this walk-through repository.

Code availability: The code is available in this walk-through.

Authors’ contributions: Conceptualization: S.G., S.dM., and J.M.R.E.; Data curation: J.M.R.E.; Formal Analysis: J.M.R.E.; Funding acquisition: S.G. and S.dM.; Investigation: S.G.; Methodology: S.G., S.dM., and J.M.R.E.; Project administration: S.G. and S.dM.; Resources: S.G. and S.dM.; Software: J.M.R.E.; Supervision: S.G. and S.dM.; Validation: J.M.R.E.; Visualization: J.M.R.E.; Writing - original draft: J.M.R.E.; Writing - review & editing: S.G. and S.dM.

References

Allaire, J., Teague, C., Scheidegger, C., Xie, Y., & Dervieux, C. (2022). Quarto. https://doi.org/10.5281/zenodo.5960048
Baker, F. (1998). An investigation of the item parameter recovery characteristics of a Gibbs sampling procedure. Applied Psychological Measurement, 22(2), 153–169. https://doi.org/10.1177/01466216980222005
Baldwin, S., & Fellingham, G. (2013). Bayesian methods for the analysis of small sample multilevel data with a complex variance structure. Journal of Psychological Methods, 18(2), 151–164. https://doi.org/10.1037/a0030642
Boonen, N., Kloots, H., Nurzia, P., & Gillis, S. (2021). Spontaneous speech intelligibility: Early cochlear implanted children versus their normally hearing peers at seven years of age. Journal of Child Language, 1–26. https://doi.org/10.1017/S0305000921000714
Boons, T., Brokx, J., Dhooge, I., Frijns, J., Peeraer, L., Vermeulen, A., … van Wieringen, A. (2012). Predictors of spoken language development following pediatric cochlear implantation. Ear and Hearing, 33(5), 617–639. https://doi.org/10.1097/AUD.0b013e3182503e47
Brito Trindade, P. L. A. P. V. de, Daniele AND Espinheira. (2021). Beta regression model nonlinear in the parameters with additive measurement errors in variables. PLOS ONE, 16(7), 1–28. https://doi.org/10.1371/journal.pone.0254103
Brooks, S., Gelman, A., Jones, G., & Meng, X. (2011). Handbook of Markov chain Monte Carlo (1st ed.). Chapman & Hall/CRC. https://doi.org/10.1201/b10905
Chin, S., Bergeson, T., & Phan, J. (2012). Speech intelligibility and prosody production in children with cochlear implants. Journal of Communication Disorders, 45, 355–366. https://doi.org/10.1016/j.jcomdis.2012.05.003
Cinelli, C., Forney, A., & Pearl, J. (2022). A crash course in good and bad controls. SSRN. https://doi.org/10.2139/ssrn.3689437
Cronbach, L., & Meehl, P. (1955). Construct validity in psychological tests. Psychological Bulletin, 52(4), 281–302. https://doi.org/10.1037/h0040957
Deffner, D., Rohrer, J., & McElreath, R. (2022). A causal framework for cross-cultural generalizability. Advances in Methods and Practices in Psychological Science, 5(3). https://doi.org/10.1177/25152459221106366
Denwood, M. (2016). runjags: An R package providing interface utilities, model templates, parallel computing methods and additional distributions for MCMC models in JAGS. Journal of Statistical Software, 71(9), 1–25. https://doi.org/10.18637/jss.v071.i09
Depaoli, S. (2014). The impact of inaccurate “informative” priors for growth parameters in Bayesian growth mixture modeling. Journal of Structural Equation Modeling, 21, 239–252. https://doi.org/10.1080/10705511.2014.882686
Depaoli, S. (2021). Bayesian structural equation modeling. The Guilford Press.
Depaoli, S., & van de Schoot, R. (2017). Improving transparency and replication in Bayesian statistics: The WAMBS-checklist. Psychological Methods, 22(2), 240–261. https://doi.org/10.1037/met0000065
Everitt, B., & Skrondal, A. (2010). The Cambridge dictionary of statistics. Cambridge University Press.
Faes, J., De Maeyer, S., & Gillis, S. (2021). Speech intelligibility of children with an auditory brainstem implant: A triple-case study. 1–50.
Fagan, M., Eisenberg, L., & Johnson, K. (2020). Investigating early pre-implant predictors of language and cognitive development in children with cochlear implants. In M. Marschark & H. Knoors (Eds.), Oxford handbook of deaf studies in learning and cognition (pp. 46–95). Oxford University Press. https://doi.org/10.1093/oxfordhb/9780190054045.013.3
Flipsen, P. (2006). Measuring the intelligibility of conversational speech in children. Clinical Linguistics & Phonetics, 20(4), 303–312. https://doi.org/10.1080/02699200400024863
Freeman, V., Pisoni, D., Kronenberger, W., & Castellanos, I. (2017). Speech intelligibility and psychosocial functioning in deaf children and teens with cochlear implants. Journal of Deaf Studies and Deaf Education, 22(3), 278–289. https://doi.org/10.1093/deafed/enx001
Gabry, J., & Češnovar, R. (2022). Cmdstanr: R interface to ’CmdStan’.
Gelman, A., Carlin, J., Stern, H., Dunson, D., Vehtari, A., & Rubin, D. (2014). Bayesian data analysis (3rd ed.). Chapman & Hall/CRC.
Gillis, S. (2018). Speech and language in congenitally deaf children with a cochlear implant. In E. Dattner & D. Ravid (Eds.), Handbook of communication disorders: Theoretical, empirical, and applied linguistic perspectives (pp. 765–792). De Gruyter Mouton. https://doi.org/10.1515/9781614514909-038
Gorinova, M., Moore, D., & Hoffman, M. (2019). Automatic reparameterisation of probabilistic programs. Retrieved from https://doi.org/10.48550/arXiv.1906.03028
Holmes, W., Bolin, J., & Kelley, K. (2019). Multilevel modeling using R (2nd ed.). Chapman & Hall/CRC. https://doi.org/10.1201/9781351062268
Hoyle, R. (Ed.). (2014). Handbook of structural equation modeling. Guilford Press.
Jeffreys, H. (1998). Theory of probability. Oxford University Press.
Kangmennaang, J., Siiba, A., & Bisung, E. (2023). Does trust mediate the relationship between experiences of discrimination and health care access and utilization among minoritized Canadians during COVID-19 pandemic? Journal of Racial and Ethnic Health Disparities. https://doi.org/10.1007/s40615-023-01809-w
Kent, R. D., Miolo, G., & Bloedel, S. (1994). The intelligibility of children’s speech: A review of evaluation procedures. American Journal of Speech-Language Pathology, 3(2), 81–95. https://doi.org/10.1044/1058-0360.0302.81
Kim, S., & Cohen, A. (1999). Accuracy of parameter estimation in Gibbs sampling under the two-parameter logistic model. Annual Meeting of the American Educational Research Association. American Educational Research Association. Retrieved from https://eric.ed.gov/?id=ED430012
Kruschke, J. (2015). Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan. Elsevier. Retrieved from https://www.sciencedirect.com/book/9780124058880/doing-bayesian-data-analysis
Kullback, S., & Leibler, R. (1951). On information and sufficiency. The Annals of Mathematical Statistics, 22(1), 79–86. Retrieved from http://www.jstor.org/stable/2236703
Lagerberg, T., Asberg, J., Hartelius, L., & Persson, C. (2014). Assessment of intelligibility using children’s spontaneous speech: Methodological aspects. International Journal of Language and Communication Disorders, 49(2), 228–239. https://doi.org/10.1111/1460-6984.12067
Lambert, P., Sutton, A., Burton, P., Abrams, K., & Jones, D. (2006). How vague is vague? A simulation study of the impact of the use of vague prior distributions in MCMC using WinBUGS. Journal of Statistics in Medicine, 24(15), 2401–2428. https://doi.org/10.1002/sim.2112
Lebl, J. (2022). Basic analysis I & II: Introduction to real analysis, volumes I & II. Retrieved from https://www.jirka.org/ra/html/frontmatter-1.html
Lee, Y., & Nelder, J. A. (1996). Hierarchical generalized linear models. Journal of the Royal Statistical Society: Series B (Methodological), 58(4), 619–656. https://doi.org/10.1111/j.2517-6161.1996.tb02105.x
Lesterhuis, M. (2018). The validity of comparative judgement for assessing text quality: An assessor’s perspective (PhD thesis). University of Antwerp.
Luce, R. (1959). On the possible psychophysical laws. The Psychological Review, 66(2), 482–499. https://doi.org/10.1037/h0043178
MacWhinney, B. (2020). The CHILDES project: Tools for analyzing talk. Lawrence Erlbaum Associates. https://doi.org/10.21415/3mhn-0z89
Martin, J., & McDonald, R. (1975). Bayesian estimation in unrestricted factor analysis: A treatment for Heywood cases. Psychometrika, 40, 505–517. https://doi.org/10.1007/BF02291552
Mayer, M. (1969). Frog, where are you? Dial Books for Young Readers. Retrieved from https://books.google.be/books?id=Asi5KQAACAAJ
McElreath, R. (2020). Statistical rethinking: A Bayesian course with examples in R and Stan. Chapman & Hall/CRC.
McElreath, R. (2021). Rethinking: Statistical rethinking book package.
Muthén, B. (2001). Second-generation structural equation modeling with a combination of categorical and continuous latent variables: New opportunities for latent class–latent growth modeling. In L. Collins & A. Sayer (Eds.), New methods for the analysis of change (pp. 291–322). American Psychological Association. https://doi.org/10.1037/10409-010
Neal, R. (2003). Slice sampling. The Annals of Statistics, 31(3), 705–741.
Neuwirth, E. (2022). RColorBrewer: ColorBrewer palettes. Retrieved from https://CRAN.R-project.org/package=RColorBrewer
Niparko, J., Tobey, E., Thal, D., Eisenberg, L., Wang, N., Quittner, A., & Fink, N. (2010). Spoken language development in children following cochlear implantation. JAMA, 303(15), 1498–1506. https://doi.org/10.1001/jama.2010.451
Plummer, M., Best, N., Cowles, K., & Vines, K. (2006). CODA: Convergence diagnosis and output analysis for MCMC. R News, 6(1), 7–11. Retrieved from https://journal.r-project.org/archive/
Pollitt, A. (2012a). Comparative judgement for assessment. International Journal of Technology and Design Education, 22(2), 157–170. https://doi.org/10.1007/s10798-011-9189-x
Pollitt, A. (2012b). The method of adaptive comparative judgement. Assessment in Education: Principles, Policy and Practice, 19(3), 281–300. https://doi.org/10.1080/0969594X.2012.665354
R Core Team. (2015). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from http://www.R-project.org/
Rabe-Hesketh, S., Skrondal, A., & Pickles, A. (2004). Generalized multilevel structural equation modeling. Psychometrika, 69(2), 167–190. https://doi.org/10.1007/BF02295939
Seaman, S. jr., J., & Stamey, J. (2011). Hidden dangers of specifying noninformative priors. The American Statistician, 66(2), 77–84. https://doi.org/10.1080/00031305.2012.695938
Shannon, C. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27(3), 379–423. https://doi.org/10.1002/j.1538-7305.1948.tb01338.x
Shmueli, G., & Koppius, O. (2011). Predictive analytics in information systems research. MIS Quarterly, 35(3), 553–572. https://doi.org/10.2307/23042796
Spiegelhalter, D., Best, N., Carlin, B., & van der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society Series B: Statistical Methodology, 64(4), 583–639. https://doi.org/10.1111/1467-9868.00353
Stan Development Team. (2020). RStan: The R interface to Stan. Retrieved from http://mc-stan.org/
Stan Development Team. (2021). Stan modeling language users guide and reference manual, version 2.26. Vienna, Austria. Retrieved from https://mc-stan.org
Tackney, M., Morris, T., White, I., Leyrat, C., Diaz-Ordaz, K., & Williamson, E. (2023). A comparison of covariate adjustment approaches under model misspecification in individually randomized trials. Trials, 24(14). https://doi.org/10.1186/s13063-022-06967-6
Thurstone, L. (1927). A law of comparative judgment. Psychological Review, 34(4), 482–499. https://doi.org/10.1037/h0070288
van Daal, T. (2020). Making a choice is not easy?!: Unravelling the task difficulty of comparative judgement to assess student work (PhD thesis). University of Antwerp.
van Heuven, V. (2008). Making sense of strange sounds: (Mutual) intelligibility of related language varieties. A review. International Journal of Humanities and Arts Computing, 2(1-2), 39–62. https://doi.org/10.3366/E1753854809000305
Vehtari, A., Gabry, J., Magnusson, M., Yao, Y., Bürkner, P., Paananen, T., & Gelman, A. (2023). Loo: Efficient leave-one-out cross-validation and WAIC for Bayesian models. Retrieved from https://mc-stan.org/loo/
Vehtari, A., Gelman, A., & Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27(5), 1413–1432. https://doi.org/10.1007/s11222-016-9696-4
Vehtari, A., Gelman, A., Simpson, D., Carpenter, B., & Bürkner, P. (2021a). Rank-normalization, folding, and localization: An improved $\widehat{R}$ for assessing convergence of MCMC (with discussion). Bayesian Analysis, 16(2), 667–718. https://doi.org/10.1214/20-BA1221
Vehtari, A., Simpson, D., Gelman, A., Yao, Y., & Gabry, J. (2021b). Pareto smoothed importance sampling. Retrieved from https://arxiv.org/abs/1507.02646
Verhavert, S., Bouwer, R., Donche, V., & De Maeyer, S. (2019). A meta-analysis on the reliability of comparative judgement. Assessment in Education: Principles, Policy and Practice, 26(5), 541–562. https://doi.org/10.1080/0969594X.2019.1602027
Watanabe, S. (2013). A widely applicable Bayesian information criterion. Journal of Machine Learning Research, 14, 867–897. Retrieved from https://www.jmlr.org/papers/volume14/watanabe13a/watanabe13a.pdf
Whitehill, T., & Chau, C. (2004). Single-word intelligibility in speakers with repaired cleft palate. Clinical Linguistics and Phonetics, 18, 341–355. https://doi.org/10.1080/02699200410001663344
Wickham, H. (2007). Reshaping data with the reshape package. Journal of Statistical Software, 21(12), 1–20. Retrieved from http://www.jstatsoft.org/v21/i12/
Wickham, H. (2022). Stringr: Simple, consistent wrappers for common string operations. Retrieved from https://CRAN.R-project.org/package=stringr
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D., François, R., … Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686
Wickham, H., François, R., Henry, L., Müller, K., & Vaughan, D. (2023). Dplyr: A grammar of data manipulation. Retrieved from https://CRAN.R-project.org/package=dplyr

Footnotes

  1. For a thorough explanation of Bayesian inference procedures, the reader can refer to Kruschke (2015) or McElreath (2020).

  2. The reader can refer to Brooks et al. (2011) for a detailed treatment of MCMC methods.

  3. An interested reader can further refer to McElreath (2020) for a detailed explanation of grid approximation.

  4. An interested reader can refer to McElreath (2020), Gorinova et al. (2019), and Neal (2003).